How to make AI return structured data
Quick answer
To make AI return structured data, state the desired format (such as JSON or CSV) explicitly in your prompt and provide a clear schema or example. Use a model such as gpt-4o with well-defined system and user messages to get more consistent structured output.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the openai Python package and set your API key as an environment variable for secure access.
pip install "openai>=1.0"
Step by step
Use the gpt-4o model with a prompt that explicitly requests JSON output. Provide a schema example in the prompt to guide the AI.
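Writing the key list and the example by hand makes it easy for the two to drift apart. As a minimal sketch, the prompt can instead be generated from one field description; build_json_prompt and its parameters are hypothetical names introduced here for illustration, not part of the OpenAI SDK.

```python
import json


def build_json_prompt(fields, example, data):
    """Build a prompt that names each key, shows one example object, and lists the input.

    fields:  mapping of key name -> human-readable type, e.g. {"age": "integer"}
    example: dict with the same keys, shown to the model as the target shape
    data:    mapping of label -> raw value to convert
    """
    # Refuse to build a prompt whose example contradicts the declared keys.
    if set(fields) != set(example):
        raise ValueError("example keys must match the declared fields")
    key_list = ", ".join(f"{name} ({kind})" for name, kind in fields.items())
    data_lines = "\n".join(f"{label}: {value}" for label, value in data.items())
    return (
        f"Return the following data as JSON with keys: {key_list}.\n"
        f"Example:\n{json.dumps(example, indent=2)}\n"
        f"Data:\n{data_lines}\n"
    )


prompt = build_json_prompt(
    fields={"name": "string", "age": "integer", "city": "string"},
    example={"name": "Alice", "age": 30, "city": "New York"},
    data={"Name": "Bob", "Age": 25, "City": "San Francisco"},
)
print(prompt)
```

The hand-written version of the same prompt, sent to the API, looks like this: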
import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
prompt = '''
Return the following data as JSON with keys: name (string), age (integer), and city (string).
Example:
{
"name": "Alice",
"age": 30,
"city": "New York"
}
Data:
Name: Bob
Age: 25
City: San Francisco
'''
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
Output:
{
"name": "Bob",
"age": 25,
"city": "San Francisco"
}
Common variations
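You can use other models, such as claude-3-5-sonnet-20241022, or enable streaming for large outputs. In web apps, async calls let the server handle other requests while it waits for the model's reply. The Anthropic example requires the anthropic package (pip install anthropic) and an ANTHROPIC_API_KEY environment variable.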
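Streaming changes how the JSON has to be handled: the reply arrives in fragments, and a partial object does not parse, so the fragments need to be joined before decoding. The following is a minimal sketch that assumes the text fragments have already been pulled out of the SDK's stream events; collect_json_stream and the simulated fragment list are illustrative names, not part of either SDK.

```python
import json


def collect_json_stream(fragments):
    """Join streamed text fragments and parse the result as JSON once the stream ends."""
    parts = []
    for fragment in fragments:
        # Skip empty or None fragments, which a stream may contain.
        if fragment:
            parts.append(fragment)
    return json.loads("".join(parts))


# Simulated fragments standing in for the text deltas of a streamed reply.
fake_stream = ['{"name": "Ca', 'rol", "age"', None, ': 40, "city": "Boston"}']
print(collect_json_stream(fake_stream))
# {'name': 'Carol', 'age': 40, 'city': 'Boston'}
```

The Anthropic version of the basic, non-streaming request follows.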
import anthropic
import os
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
system_prompt = "You are a helpful assistant that returns data strictly in JSON format."
user_prompt = '''
Return the following data as JSON with keys: name, age, city.
Data:
Name: Carol
Age: 40
City: Boston
'''
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=200,
    system=system_prompt,
    messages=[{"role": "user", "content": user_prompt}],
)
print(message.content[0].text)
Output:
{
"name": "Carol",
"age": 40,
"city": "Boston"
}
Troubleshooting
If the AI returns unstructured text or extra commentary, reinforce the prompt with an instruction such as "Return only JSON, no explanations." In your application, ask for the JSON between explicit delimiters or validate the parsed output against a JSON schema, so that malformed replies are caught instead of being passed downstream.
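On the application side, a reply that wraps the object in commentary or code fences can often still be recovered. The sketch below is one possible approach using only the standard library; extract_json_object is a hypothetical helper name, and the noisy reply is made up for illustration.

```python
import json


def extract_json_object(reply):
    """Return the first JSON object found in a model reply, or raise ValueError.

    Tries each "{" in turn and lets the JSON decoder find where the object ends,
    so text before and after the object is ignored.
    """
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(reply, start)
            return obj
        except json.JSONDecodeError:
            # This "{" did not begin valid JSON; move on to the next one.
            start = reply.find("{", start + 1)
    raise ValueError("no JSON object found in reply")


noisy_reply = (
    "Sure! Here is the data:\n"
    '{"name": "Dave", "age": 28, "city": "Seattle"}\n'
    "Let me know if you need anything else."
)
print(extract_json_object(noisy_reply))
```

Recovering the object this way is a fallback. The prompt-side fix looks like this: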
prompt = '''
Return ONLY the JSON object with keys: name, age, city. No extra text or explanation.
Data:
Name: Dave
Age: 28
City: Seattle
'''
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
Output:
{
"name": "Dave",
"age": 28,
"city": "Seattle"
}
Key Takeaways
- Always specify the exact structured format (e.g., JSON) and provide an example in your prompt.
- Use explicit instructions like 'Return only JSON, no explanations' to avoid unstructured output.
- Test with different models, such as gpt-4o or claude-3-5-sonnet-20241022, to find which one follows your format most reliably.
- Validate the AI output programmatically to handle any formatting errors.
- Use environment variables for API keys and the latest SDK patterns for secure, maintainable code.
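The validation takeaway can be sketched with the standard library alone. This is an illustrative check for the name/age/city records used throughout, not a full JSON Schema validator; EXPECTED_TYPES and validate_record are hypothetical names chosen for this example.

```python
import json

# Mirrors the key list given in the prompts: name (string), age (integer), city (string).
EXPECTED_TYPES = {"name": str, "age": int, "city": str}


def validate_record(text):
    """Parse model output and check that it has exactly the expected keys and types."""
    try:
        record = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(record, dict):
        raise ValueError("output is not a JSON object")
    if set(record) != set(EXPECTED_TYPES):
        raise ValueError(f"unexpected keys: {sorted(record)}")
    for key, expected in EXPECTED_TYPES.items():
        value = record[key]
        # bool is a subclass of int in Python; no field here expects a boolean,
        # so booleans are rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(
                f"{key!r} should be {expected.__name__}, got {type(value).__name__}"
            )
    return record


print(validate_record('{"name": "Dave", "age": 28, "city": "Seattle"}'))
```

A ValueError from this function is a natural point to log the raw reply and retry the request with a stricter prompt.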