How to use the OpenAI API in Python
Direct answer

Use the openai Python SDK v1+: import OpenAI, initialize the client with your API key from os.environ, then call client.chat.completions.create with your model and messages.

Setup

Install

```shell
pip install openai
```

Env vars

OPENAI_API_KEY

Imports

```python
import os
from openai import OpenAI
```

Examples
In: Hello, how are you?
Out: I'm doing great, thank you! How can I assist you today?

In: Write a Python function to reverse a string.
Out:

```python
def reverse_string(s):
    return s[::-1]
```

In: Explain quantum computing in simple terms.
Out: Quantum computing uses quantum bits that can be in multiple states at once, enabling faster problem solving for certain tasks.
Integration steps
- Install the OpenAI Python SDK with pip.
- Set your API key in the environment variable OPENAI_API_KEY.
- Import OpenAI and initialize the client with the API key from os.environ.
- Create a messages list with roles and content for the chat completion.
- Call client.chat.completions.create with the model and messages.
- Extract the response text from response.choices[0].message.content.
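The messages list from the steps above can also include a system message to steer the assistant's tone. A minimal sketch (the system prompt text here is only an illustration):

```python
# Each message is a dict with a "role" ("system", "user", or "assistant")
# and a "content" string. The system prompt below is illustrative.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Hello, how are you?"},
]
```

The system message is optional; with only user messages the model falls back to its default behavior.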
Full code

```python
import os
from openai import OpenAI

# Initialize client with API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define chat messages
messages = [
    {"role": "user", "content": "Hello, how are you?"}
]

# Create chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

# Extract and print the assistant's reply
print("Assistant:", response.choices[0].message.content)
```

Output
Assistant: I'm doing great, thank you! How can I assist you today?
API trace
Request
{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello, how are you?"}]}

Response

{"choices": [{"message": {"content": "I'm doing great, thank you! How can I assist you today?"}}], "usage": {"total_tokens": 15}}

Extract

response.choices[0].message.content

Variants
Streaming Chat Completion ›
Use streaming to display partial responses in real-time for better user experience with long outputs.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
messages = [{"role": "user", "content": "Tell me a joke."}]

# Stream the response and print tokens as they arrive
stream = client.chat.completions.create(model="gpt-4o", messages=messages, stream=True)
for chunk in stream:
    # In SDK v1+, delta is an object, not a dict, and content may be None
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

Async Chat Completion ›
Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.
```python
import os
import asyncio
from openai import AsyncOpenAI

async def main():
    # Async requests require the AsyncOpenAI client; the sync OpenAI
    # client has no awaitable methods, and acreate() was removed in v1.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "user", "content": "Explain recursion."}]
    response = await client.chat.completions.create(model="gpt-4o", messages=messages)
    print(response.choices[0].message.content)

asyncio.run(main())
```

Using a Smaller Model for Cost Efficiency ›
Use smaller models like gpt-4o-mini to reduce cost and latency when high precision is not critical.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
messages = [{"role": "user", "content": "Summarize the benefits of AI."}]
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

Performance
Latency: ~800ms for gpt-4o non-streaming calls
Cost: ~$0.002 per 500 tokens exchanged with gpt-4o
Rate limits: Tier 1: 500 requests per minute / 30,000 tokens per minute
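When a request exceeds those rate limits, the SDK raises a RateLimitError, and retrying with exponential backoff is the usual mitigation. A minimal sketch, written against a plain callable so the retry logic stands alone (the helper name is illustrative; in real code you would catch openai.RateLimitError rather than the generic Exception):

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff; re-raise after max_retries."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # in real code: except openai.RateLimitError
            if attempt == max_retries - 1:
                raise
            # Delay doubles each attempt: base, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))
```

Usage would look like `with_backoff(lambda: client.chat.completions.create(model="gpt-4o", messages=messages))`.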
- Keep prompts concise to reduce token usage.
- Use smaller models for less critical tasks.
- Cache frequent queries to avoid repeated calls.
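The caching tip above can be as simple as an in-memory dict keyed on the prompt. A minimal sketch (the cache and helper names are illustrative, not part of the SDK):

```python
# Simple in-memory cache: prompt string -> reply string.
_cache = {}

def cached_completion(prompt, fetch):
    """Return a cached reply for `prompt`, calling `fetch` only on a miss."""
    if prompt not in _cache:
        _cache[prompt] = fetch(prompt)
    return _cache[prompt]
```

Here `fetch` would wrap the client.chat.completions.create call; repeated identical prompts then cost a single API call. For production use you would bound the cache size and key on the model and parameters too, not just the prompt.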
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard Chat Completion | ~800ms | ~$0.002 | General purpose, reliable |
| Streaming Chat Completion | Starts immediately, ~800ms total | ~$0.002 | Real-time UI updates |
| Async Chat Completion | ~800ms | ~$0.002 | Concurrent requests in async apps |
| Smaller Model (gpt-4o-mini) | ~400ms | ~$0.0005 | Cost-sensitive or low-latency needs |
Quick tip
Always load your API key securely from environment variables and never hardcode it in your source code.
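One way to follow that tip is to fail fast when the variable is missing instead of letting the SDK raise a less obvious error later. A small sketch (the helper name is illustrative):

```python
import os

def load_api_key():
    """Read OPENAI_API_KEY from the environment, failing loudly if absent."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Passing the result to `OpenAI(api_key=load_api_key())` makes the failure mode explicit at startup.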
Common mistake
Beginners often use deprecated SDK methods like openai.ChatCompletion.create() instead of the current client.chat.completions.create() pattern.