How to use logprobs in OpenAI API
Quick answer
Set the logprobs parameter to True in the chat.completions.create method to get token-level log probabilities from the OpenAI API. To also receive the most likely alternative tokens at each position, set top_logprobs to an integer between 0 and 20; the API then returns that many top candidates per generated token.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable.
- Run pip install openai to install the SDK.
- Set your API key in your shell environment: export OPENAI_API_KEY='your_api_key_here' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key_here" (Windows).
Step by step
This example shows how to request log probabilities for each token generated by the gpt-4o model. Setting logprobs=True enables the feature, and top_logprobs=5 requests the five most likely candidate tokens at each position.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain the word 'hello'"}],
    max_tokens=10,
    logprobs=True,
    top_logprobs=5,
)

choice = response.choices[0]
print("Generated text:", choice.message.content)

# Access token-level logprobs: choice.logprobs.content is a list with one
# entry per generated token, each carrying .token, .logprob, and .top_logprobs.
for entry in choice.logprobs.content:
    top = {alt.token: alt.logprob for alt in entry.top_logprobs}
    print(entry.token, entry.logprob, top)
Output
Generated text: Hello is a greeting.
Hello -0.1 {'Hello': -0.1, 'Hi': -1.2, 'Hey': -1.5, ...}
 is -0.3 {' is': -0.3, ...}
 a -0.2 {' a': -0.2, ...}
 greeting -0.05 {' greeting': -0.05, ...}
. -0.15 {'.': -0.15, ...}
Common variations
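Log probabilities are natural logarithms, so values near 0 mean high confidence. A quick sketch of converting them into plain probabilities; the logprob values here are the illustrative ones from the sample output, not real API results:

```python
import math

# Illustrative per-token logprobs, matching the sample output above.
token_logprobs = [-0.1, -0.3, -0.2, -0.05, -0.15]

# exp() turns a natural-log probability back into a linear probability.
token_probs = [round(math.exp(lp), 3) for lp in token_logprobs]
print(token_probs)  # → [0.905, 0.741, 0.819, 0.951, 0.861]

# The joint probability of the whole sequence is exp of the summed logprobs.
sequence_prob = math.exp(sum(token_logprobs))
print(round(sequence_prob, 3))  # → 0.449
```

Summing logprobs instead of multiplying probabilities avoids floating-point underflow on long sequences.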
You can adjust top_logprobs to get more or fewer candidate tokens per position (0 to 20); setting logprobs=False disables the feature entirely. The legacy completions endpoint also supports log probabilities, but there logprobs is an integer rather than a boolean.
For async usage, use AsyncOpenAI with asyncio; the method name is the same, you simply await it.
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Explain 'hello'"}],
        max_tokens=10,
        logprobs=True,
        top_logprobs=3,
    )
    choice = response.choices[0]
    print("Tokens:", [entry.token for entry in choice.logprobs.content])

asyncio.run(main())
Output
Tokens: ['Hello', ' is', ' a', ' greeting', '.']
Troubleshooting
- If logprobs is missing in the response, verify that your model supports it and that you passed logprobs=True.
- Logprobs increase response size; if you hit timeouts, reduce max_tokens or the top_logprobs count.
- When streaming, logprob data is not returned as a single final object: each chunk carries its own logprobs field, so you must collect the entries yourself across chunks.
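A minimal sketch of that chunk-by-chunk collection. It uses stub objects in place of real streaming chunks so it runs without an API key; the chunk shape (choices[0].logprobs.content, which may be None for content-free chunks) is an assumption modeled on the non-streaming response format above:

```python
from types import SimpleNamespace

def collect_logprobs(chunks):
    """Gather token logprob entries from streaming chunks.

    Assumes each chunk exposes choices[0].logprobs.content, which may be
    None for chunks that carry no token content (e.g. the final chunk).
    """
    collected = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        logprobs = chunk.choices[0].logprobs
        if logprobs and logprobs.content:
            collected.extend(logprobs.content)
    return collected

# Stub chunks mimicking the assumed streaming shape, for illustration only.
def _chunk(entries):
    content = [SimpleNamespace(token=t, logprob=lp) for t, lp in entries] or None
    return SimpleNamespace(choices=[SimpleNamespace(logprobs=SimpleNamespace(content=content))])

chunks = [_chunk([("Hello", -0.1)]), _chunk([(" is", -0.3)]), _chunk([])]
print([e.token for e in collect_logprobs(chunks)])  # → ['Hello', ' is']
```

With a real stream, you would pass the iterator returned by chat.completions.create(..., stream=True) in place of the stub list.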
Key Takeaways
- Set logprobs=True (and optionally top_logprobs) in chat.completions.create to get token-level log probabilities.
- Logprobs give detailed insight into token likelihoods, useful for confidence scoring, analysis, and debugging.
- Not all models support logprobs; check the model docs.
- Use AsyncOpenAI for concurrency; when streaming, logprob data arrives per chunk and must be collected manually.
- Larger top_logprobs values increase response size and latency; tune accordingly.