How-to · Beginner · 3 min read

How to use logprobs in OpenAI API

Quick answer
Set logprobs=True in the chat.completions.create method to get token-level log probabilities from the OpenAI API. To also receive the most likely alternative tokens at each position, set top_logprobs to an integer between 0 and 20.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key with available credit
  • pip install openai>=1.0

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable.

  • Run pip install openai to install the SDK.
  • Set your API key in your shell environment: export OPENAI_API_KEY='your_api_key_here' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key_here" (Windows).
bash
pip install openai
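
Before making requests, you can confirm the key is actually visible to Python. A minimal sketch (the helper name api_key_configured is ours, not part of the SDK):

```python
import os

def api_key_configured() -> bool:
    """Check whether OPENAI_API_KEY is set and non-empty in the environment."""
    return bool(os.environ.get("OPENAI_API_KEY"))
```

If this returns False, re-export the variable in the same shell that launches your script.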

Step by step

This example requests log probabilities for each token generated by the gpt-4o model. Setting logprobs=True enables per-token log probabilities, and top_logprobs=5 additionally returns the 5 most likely alternative tokens at each position.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain the word 'hello'"}],
    max_tokens=10,
    logprobs=True,
    top_logprobs=5
)

choice = response.choices[0]
print("Generated text:", choice.message.content)

# Access token-level logprobs: choice.logprobs.content is a list of
# per-token objects, each with .token, .logprob, and .top_logprobs
content = choice.logprobs.content
print("Tokens:", [t.token for t in content])
print("Token logprobs:", [t.logprob for t in content])
print("Top alternatives for first token:",
      [(alt.token, alt.logprob) for alt in content[0].top_logprobs])
output
Generated text: Hello is a greeting.
Tokens: ['Hello', ' is', ' a', ' greeting', '.']
Token logprobs: [-0.1, -0.3, -0.2, -0.05, -0.15]
Top alternatives for first token: [('Hello', -0.1), ('Hi', -1.2), ('Hey', -1.5)]
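
Log probabilities are natural logs, so math.exp converts them back to plain probabilities. A quick sketch using the sample values from the output above:

```python
import math

# Per-token log probabilities from the sample output above
token_logprobs = [-0.1, -0.3, -0.2, -0.05, -0.15]

# Convert each logprob back to a linear probability
probs = [math.exp(lp) for lp in token_logprobs]

# A common aggregate: perplexity of the whole completion
avg_logprob = sum(token_logprobs) / len(token_logprobs)
perplexity = math.exp(-avg_logprob)

print([round(p, 3) for p in probs])  # → [0.905, 0.741, 0.819, 0.951, 0.861]
print(round(perplexity, 3))          # → 1.174
```

Lower perplexity means the model was, on average, more confident in its own output.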

Common variations

You can adjust top_logprobs (0-20) to get more or fewer alternatives per token; setting logprobs=False (the default) disables the feature entirely. Note that the legacy completions endpoint uses a different convention: there, logprobs itself is an integer specifying how many top tokens to return.
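
The two parameter conventions can be compared side by side. A sketch of the request keyword arguments only (no network call is made here):

```python
# Chat Completions endpoint: logprobs is a boolean,
# top_logprobs an integer between 0 and 20
chat_params = {
    "model": "gpt-4o",
    "logprobs": True,
    "top_logprobs": 3,
}

# Legacy Completions endpoint: logprobs itself is
# the integer count of top tokens to return
legacy_params = {
    "model": "gpt-3.5-turbo-instruct",
    "logprobs": 3,
}
```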

For async usage, use the AsyncOpenAI client with asyncio. Streaming chat completions can also return logprobs: each streamed chunk carries the logprobs for the tokens in that chunk.

python
import asyncio
import os
from openai import AsyncOpenAI

async def main():
    # Use the async client; the awaitable method is still .create
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Explain 'hello'"}],
        max_tokens=10,
        logprobs=True,
        top_logprobs=3
    )
    choice = response.choices[0]
    print("Tokens:", [t.token for t in choice.logprobs.content])

asyncio.run(main())
output
Tokens: ['Hello', ' is', ' a', ' greeting', '.']
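
The shape of choice.logprobs.content can be explored offline. The dataclasses below are simplified stand-ins for the SDK's response objects, populated with the sample values from earlier; the confidence threshold is an arbitrary illustration:

```python
from dataclasses import dataclass

@dataclass
class TopLogprob:        # stand-in for the SDK's TopLogprob object
    token: str
    logprob: float

@dataclass
class TokenLogprob:      # stand-in for ChatCompletionTokenLogprob
    token: str
    logprob: float
    top_logprobs: list

content = [
    TokenLogprob("Hello", -0.1, [TopLogprob("Hello", -0.1), TopLogprob("Hi", -1.2)]),
    TokenLogprob(" is", -0.3, [TopLogprob(" is", -0.3), TopLogprob(" was", -2.0)]),
]

# Reconstruct the text and flag low-confidence tokens (threshold is arbitrary)
text = "".join(t.token for t in content)
uncertain = [t.token for t in content if t.logprob < -0.25]
print(text)       # → Hello is
print(uncertain)  # → [' is']
```

The same iteration pattern works on a real response object, since attribute names match.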

Troubleshooting

  • If logprobs is missing in the response, verify that your model supports it and that you passed logprobs=True.
  • Requesting top_logprobs enlarges the response; if you hit timeouts, reduce max_tokens or the top_logprobs count.
  • In streaming mode, logprobs arrive incrementally on each chunk rather than in a single final object.

Key Takeaways

  • Set logprobs=True (and optionally top_logprobs) in chat.completions.create to get token-level log probabilities.
  • Logprobs provide detailed insight into token likelihoods, useful for confidence scoring and debugging.
  • Not all models support logprobs; check the model docs.
  • Use the AsyncOpenAI client for concurrency; in streaming mode, logprobs arrive chunk by chunk.
  • Large top_logprobs values increase response size and latency; tune accordingly.
Verified 2026-04 · gpt-4o