Fix LLM hallucinating in summaries
Quick answer
To reduce hallucinations in LLM-generated summaries, use precise prompt engineering with explicit instructions and context, and select reliable models like gpt-4o or claude-sonnet-4-5. Additionally, apply API parameters such as temperature=0 and a max_tokens limit to rein in creativity and keep output grounded in the source.

Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable.
- Install the package:

```shell
pip install openai
```

- Set the environment variable in your shell:

```shell
export OPENAI_API_KEY='your_api_key'
```

Output of pip install openai:

```
Collecting openai
  Downloading openai-1.0.0-py3-none-any.whl (50 kB)
Installing collected packages: openai
Successfully installed openai-1.0.0
```
Step by step
Use explicit prompt instructions and set temperature=0 to minimize hallucinations. Provide relevant context and limit output length.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

prompt = (
    "Summarize the following text accurately without adding any information or speculation:\n"
    "\nText: "
    "The Apollo 11 mission landed the first humans on the Moon in 1969."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    max_tokens=100,
)

summary = response.choices[0].message.content
print("Summary:", summary)
```

Output:

```
Summary: Apollo 11 was the first mission to land humans on the Moon in 1969.
```
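Beyond prompting, you can sanity-check the returned summary against the source text before trusting it. The sketch below is a minimal lexical check (the function name and stop-word list are my own, not from any SDK): words that appear in the summary but never in the source are candidates for hallucinated content.

```python
import re

def unsupported_words(summary: str, source: str) -> set:
    """Return summary words that never appear in the source text."""
    def words(text):
        return set(re.findall(r"[a-z']+", text.lower()))
    # Small illustrative stop-word list; extend as needed.
    stop = {"the", "a", "an", "is", "was", "to", "of", "in", "on", "and", "by"}
    return words(summary) - words(source) - stop

source = "The Apollo 11 mission landed the first humans on the Moon in 1969."
faithful = "Apollo 11 landed the first humans on the Moon."
embellished = "Apollo 11, commanded by Neil Armstrong, landed on the Moon."

print(unsupported_words(faithful, source))     # set() -- nothing flagged
print(unsupported_words(embellished, source))  # flags 'commanded', 'neil', 'armstrong'
```

A lexical check cannot catch paraphrased hallucinations, but it is a cheap first filter before a more expensive model-based verification pass.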
Common variations
You can use other reliable models like claude-sonnet-4-5 with the Anthropic SDK or adjust parameters such as top_p and frequency_penalty to reduce hallucinations. Async calls and streaming are also supported.
```python
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

system_prompt = "You are a helpful assistant that summarizes text accurately without hallucinating."
user_message = (
    "Summarize the following text without adding any information or speculation:\n"
    "The Apollo 11 mission landed the first humans on the Moon in 1969."
)

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=100,
    system=system_prompt,
    messages=[{"role": "user", "content": user_message}],
)

# response.content is a list of content blocks; the text lives in .text
print("Summary:", response.content[0].text)
```

Output:

```
Summary: Apollo 11 was the first mission to land humans on the Moon in 1969.
```
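Async calls follow a fan-out pattern like the sketch below. Here summarize() is a stand-in coroutine (a real version would await a client call on AsyncOpenAI or AsyncAnthropic), so only the concurrency structure is illustrative.

```python
import asyncio

async def summarize(text: str) -> str:
    # Stand-in for an async API call such as
    # await client.chat.completions.create(...) on an AsyncOpenAI client.
    await asyncio.sleep(0)
    return text.split(". ")[0] + "."

async def summarize_all(texts):
    # Fan out the requests concurrently; gather returns results in input order.
    return await asyncio.gather(*(summarize(t) for t in texts))

texts = [
    "Apollo 11 landed in 1969. It carried three astronauts.",
    "Voyager 1 launched in 1977. It is now in interstellar space.",
]
print(asyncio.run(summarize_all(texts)))
```

Running several summaries concurrently this way hides the per-request network latency instead of paying it serially.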
Troubleshooting
- If summaries still hallucinate, keep temperature at 0 and make the prompt instructions more explicit.
- Provide more context or source text to the model.
- Use max_tokens to limit output length and avoid verbose speculation.
- Test different models known for factual accuracy, like gpt-4o or claude-sonnet-4-5.
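The "provide more context" advice can be packaged in a small helper. This sketch (the helper name and wording are my own, not from any SDK) wraps the source text in explicit delimiters and gives the model a refusal clause as an alternative to guessing:

```python
def build_summary_prompt(source_text: str) -> str:
    """Build a summarization prompt that confines the model to the given text."""
    return (
        "Summarize the text between the <source> tags. "
        "Use only facts stated in the text; do not add names, dates, or numbers "
        "that are not present. If the text is too short to summarize, say so.\n"
        f"<source>\n{source_text}\n</source>"
    )

prompt = build_summary_prompt(
    "The Apollo 11 mission landed the first humans on the Moon in 1969."
)
print(prompt)
```

Delimiters also make it harder for instructions embedded in the source text to be mistaken for your own instructions.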
Key Takeaways
- Use explicit, clear prompts instructing the model to avoid adding information.
- Set temperature=0 to reduce creativity and hallucinations in summaries.
- Choose reliable models like gpt-4o or claude-sonnet-4-5 for factual accuracy.
- Limit output length with max_tokens to prevent verbose or speculative text.
- Provide sufficient context or source text to guide the model's summary.