Code beginner · 3 min read

How to Use the OpenAI API in Python

Direct answer
Use the openai Python SDK (v1+): import OpenAI, initialize the client with your API key from os.environ, and call client.chat.completions.create with your model and messages. The reply text is available at response.choices[0].message.content.

Setup

Install

```bash
pip install openai
```

Env vars: OPENAI_API_KEY

Imports

```python
import os
from openai import OpenAI
```

Examples

In: Hello, how are you?
Out: I'm doing great, thank you! How can I assist you today?

In: Write a Python function to reverse a string.
Out: def reverse_string(s): return s[::-1]

In: Explain quantum computing in simple terms.
Out: Quantum computing uses quantum bits that can be in multiple states at once, enabling faster problem solving for certain tasks.

Integration steps

  1. Install the OpenAI Python SDK with pip.
  2. Set your API key in the environment variable OPENAI_API_KEY.
  3. Import OpenAI and initialize the client with the API key from os.environ.
  4. Create a messages list with roles and content for the chat completion.
  5. Call client.chat.completions.create with the model and messages.
  6. Extract the response text from response.choices[0].message.content.
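Step 4 deserves a closer look: the messages list can carry more than a single user turn. A minimal sketch (the prompt contents here are illustrative) showing a system prompt and prior conversation turns:

```python
# Each entry needs a "role" ("system", "user", or "assistant") and "content".
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great! How can I help?"},
    {"role": "user", "content": "Write a Python function to reverse a string."},
]

roles = [m["role"] for m in messages]
print(roles)  # → ['system', 'user', 'assistant', 'user']
```

Passing prior assistant turns back in is how you give the model conversation memory; the API itself is stateless.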

Full code

```python
import os
from openai import OpenAI

# Initialize client with API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define chat messages
messages = [
    {"role": "user", "content": "Hello, how are you?"}
]

# Create chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

# Extract and print the assistant's reply
print("Assistant:", response.choices[0].message.content)
```

Output:

```
Assistant: I'm doing great, thank you! How can I assist you today?
```

API trace

Request

```json
{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello, how are you?"}]}
```

Response

```json
{"choices": [{"message": {"content": "I'm doing great, thank you! How can I assist you today?"}}], "usage": {"total_tokens": 15}}
```

Extract: response.choices[0].message.content

Variants

Streaming Chat Completion

Use streaming to display partial responses in real-time for better user experience with long outputs.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

messages = [{"role": "user", "content": "Tell me a joke."}]

# Stream the response: each chunk carries a delta object with the next
# piece of text. In SDK v1+, delta is an object, not a dict, so access
# its content attribute (which can be None) rather than calling .get().
stream = client.chat.completions.create(model="gpt-4o", messages=messages, stream=True)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
print()
```
Async Chat Completion

Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.

```python
import os
import asyncio
from openai import AsyncOpenAI

async def main():
    # Use AsyncOpenAI for awaitable calls; SDK v1+ has no acreate method.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "user", "content": "Explain recursion."}]
    response = await client.chat.completions.create(model="gpt-4o", messages=messages)
    print(response.choices[0].message.content)

asyncio.run(main())
```
Using a Smaller Model for Cost Efficiency

Use smaller models like gpt-4o-mini to reduce cost and latency when high precision is not critical.

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

messages = [{"role": "user", "content": "Summarize the benefits of AI."}]

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

Performance

Latency: ~800ms for gpt-4o non-streaming calls
Cost: ~$0.002 per 500 tokens exchanged with gpt-4o
Rate limits: Tier 1: 500 requests per minute / 30,000 tokens per minute
  • Keep prompts concise to reduce token usage.
  • Use smaller models for less critical tasks.
  • Cache frequent queries to avoid repeated calls.
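The caching tip above can be sketched with functools.lru_cache. The ask helper and its canned reply are illustrative stand-ins for a real client.chat.completions.create call:

```python
from functools import lru_cache

call_count = 0  # counts how often the (stand-in) API call actually runs

@lru_cache(maxsize=128)
def ask(prompt: str) -> str:
    # In a real app this body would call client.chat.completions.create(...)
    global call_count
    call_count += 1
    return f"answer to: {prompt}"

ask("Summarize the benefits of AI.")
ask("Summarize the benefits of AI.")  # repeated prompt is served from the cache
print(call_count)  # → 1
```

Note that lru_cache only helps when prompts repeat exactly; for fuzzy matching or persistence across restarts you would need an external cache.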
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Chat Completion | ~800ms | ~$0.002 | General purpose, reliable |
| Streaming Chat Completion | Starts immediately, ~800ms total | ~$0.002 | Real-time UI updates |
| Async Chat Completion | ~800ms | ~$0.002 | Concurrent requests in async apps |
| Smaller Model (gpt-4o-mini) | ~400ms | ~$0.0005 | Cost-sensitive or low-latency needs |

Quick tip

Always load your API key securely from environment variables and never hardcode it in your source code.
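As a sketch of that tip (the setdefault line only supplies a placeholder so the snippet runs standalone; in real use the key is exported in your shell, never written into code):

```python
import os

# In real use: export OPENAI_API_KEY=sk-... in your shell, not in code.
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")

api_key = os.environ["OPENAI_API_KEY"]

# In SDK v1+, OpenAI() also reads OPENAI_API_KEY automatically, so
# client = OpenAI() works without passing api_key explicitly.
```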

Common mistake

Beginners often use deprecated SDK methods like openai.ChatCompletion.create() instead of the current client.chat.completions.create() pattern.

Verified 2026-04 · gpt-4o, gpt-4o-mini