How to get response text from the OpenAI API in Python
Direct answer
Use the OpenAI Python SDK (v1+): call client.chat.completions.create() with your messages, then access the response text via response.choices[0].message.content.

Setup
Install
```shell
pip install openai
```
Env vars
OPENAI_API_KEY
Imports
```python
import os
from openai import OpenAI
```

Examples
In: Hello, how are you?
Out: I'm doing well, thank you! How can I assist you today?
In: Write a short poem about spring.
Out: Spring blooms anew, with colors bright and true, nature's gentle cue.
In: (empty prompt)
Out: Please provide a prompt to generate a response.
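The empty-input example above suggests guarding before spending an API call. A minimal sketch (the `ask` helper is my own; the SDK import is deferred so the guard itself has no dependency on it):

```python
import os

def ask(prompt: str) -> str:
    # Guard first: an empty prompt never reaches the API.
    if not prompt.strip():
        return "Please provide a prompt to generate a response."
    from openai import OpenAI  # deferred so the guard works without the SDK
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```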
Integration steps
- Import the OpenAI client and load your API key from environment variables.
- Initialize the OpenAI client with your API key.
- Create a messages list with the user prompt.
- Call client.chat.completions.create() with the model and messages.
- Extract the response text from response.choices[0].message.content.
Full code
```python
import os
from openai import OpenAI

# Initialize client with API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define the user message
messages = [{"role": "user", "content": "Hello, how are you?"}]

# Create chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
)

# Extract and print the response text
text = response.choices[0].message.content
print("Response from OpenAI:", text)
```

Output
Response from OpenAI: I'm doing well, thank you! How can I assist you today?
API trace
Request
{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello, how are you?"}]}
Response
{"choices": [{"message": {"content": "I'm doing well, thank you! How can I assist you today?"}}], "usage": {"total_tokens": 20}}
Extract
response.choices[0].message.content

Variants
Streaming response ›
Use streaming to display the response token-by-token for better user experience with long outputs.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
messages = [{"role": "user", "content": "Tell me a story."}]

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    stream=True,
)
# In SDK v1 each chunk's delta is an object, not a dict, and its
# content attribute can be None, so fall back to an empty string.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()
```

Async version ›
Use async calls to handle multiple concurrent requests efficiently in asynchronous Python applications.
```python
import os
import asyncio
from openai import AsyncOpenAI  # SDK v1 provides a dedicated async client

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    messages = [{"role": "user", "content": "Explain quantum computing."}]
    # The async client awaits the same create() method;
    # acreate() was removed in SDK v1.
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```

Alternative model (gpt-4o-mini) ›
Use a smaller model like gpt-4o-mini for faster responses and lower cost when high detail is not required.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
messages = [{"role": "user", "content": "Summarize the latest news."}]
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
print(response.choices[0].message.content)
```

Performance
Latency: ~800 ms for gpt-4o, non-streaming
Cost: ~$0.002 per 500 tokens exchanged on gpt-4o
Rate limits: Tier 1: 500 requests per minute / 30,000 tokens per minute
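When a call exceeds those limits, SDK v1 raises `openai.RateLimitError`; a common mitigation is exponential backoff. Below is a library-agnostic sketch (the helper name and delay values are illustrative, not part of the SDK):

```python
import time

def with_backoff(call, is_retryable, max_retries=5, base_delay=1.0):
    """Run call(), retrying with doubling delays while is_retryable(exc) holds."""
    delay = base_delay
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc) or attempt == max_retries - 1:
                raise
            time.sleep(delay)  # back off before the next attempt
            delay *= 2         # 1 s, 2 s, 4 s, ...
```

With the SDK this would be invoked as `with_backoff(lambda: client.chat.completions.create(model="gpt-4o", messages=messages), lambda e: isinstance(e, openai.RateLimitError))`.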
- Keep prompts concise to reduce token usage.
- Use smaller models like gpt-4o-mini for cheaper calls.
- Cache frequent responses to avoid repeated calls.
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call | ~800ms | ~$0.002 | General purpose, simple integration |
| Streaming | Starts immediately, total ~800ms | ~$0.002 | Long responses with better UX |
| Async call | ~800ms | ~$0.002 | Concurrent requests in async apps |
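The "Async call" row pays off when several prompts are issued at once via `asyncio.gather`. This sketch simulates the network call with `asyncio.sleep` so it runs standalone; in real use `fetch_reply` would await `client.chat.completions.create()` on an `AsyncOpenAI` client:

```python
import asyncio

async def fetch_reply(prompt: str) -> str:
    # Stand-in for an awaited chat.completions.create() call;
    # the sleep simulates network latency.
    await asyncio.sleep(0.01)
    return f"reply to: {prompt}"

async def ask_many(prompts):
    # All requests run concurrently; gather preserves input order.
    return await asyncio.gather(*(fetch_reply(p) for p in prompts))

results = asyncio.run(ask_many(["a", "b", "c"]))
```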
Quick tip
Always extract the response text from `response.choices[0].message.content` to get the assistant's reply.
Common mistake
Beginners often forget to access `choices[0].message.content` and instead try to print the entire response object.
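The access path is easy to see on a stand-in object shaped like the SDK's `ChatCompletion` (a real `response` comes from `client.chat.completions.create()`):

```python
from types import SimpleNamespace

# Stand-in with the same nesting as an SDK ChatCompletion response.
response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hello!"))],
    usage=SimpleNamespace(total_tokens=20),
)

whole_object = str(response)                # the mistake: full object repr
text = response.choices[0].message.content  # the fix: just the reply text
```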