How to use Instructor with OpenAI
Quick answer
Use the instructor library to wrap the OpenAI client for structured extraction with Pydantic models. Define a BaseModel schema, then call client.chat.completions.create with response_model and your messages to get a typed response.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- `pip install openai>=1.0 instructor pydantic`
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages: `pip install openai instructor pydantic`
- Set environment variable: `export OPENAI_API_KEY='your_api_key'` (Linux/macOS) or `setx OPENAI_API_KEY "your_api_key"` (Windows)

Step by step
Define a Pydantic model for the structured data you want to extract, then use instructor.from_openai to create a client wrapping the OpenAI SDK. Call client.chat.completions.create with your prompt and the response_model to get typed output.
```python
import os

from openai import OpenAI
import instructor
from pydantic import BaseModel

# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Wrap the OpenAI client with Instructor
instructor_client = instructor.from_openai(client)

# Define a Pydantic model for the data to extract
class User(BaseModel):
    name: str
    age: int

# Prepare messages
messages = [{"role": "user", "content": "Extract: John is 30 years old"}]

# Call the model with response_model for structured extraction
response = instructor_client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    messages=messages,
)

# Access typed fields
print(f"Name: {response.name}, Age: {response.age}")
```

Output:

```
Name: John, Age: 30
```
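The object Instructor returns is an ordinary Pydantic instance, so the usual serialization helpers apply to it. A minimal sketch, using a hand-constructed User in place of an actual API response:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# A hand-constructed instance standing in for what
# instructor_client.chat.completions.create returns
response = User(name="John", age=30)

# Standard Pydantic serialization helpers work on the result
print(response.model_dump())       # {'name': 'John', 'age': 30}
print(response.model_dump_json())  # {"name":"John","age":30}
```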
Common variations
You can use different OpenAI models, such as gpt-4o or gpt-4o-mini, depending on your accuracy and cost needs. Instructor also supports asynchronous usage: wrap an AsyncOpenAI client with instructor.from_openai and await the call. Similarly, instructor.from_anthropic wraps Anthropic clients.

```python
import asyncio

from openai import AsyncOpenAI

# Async usage requires wrapping the async OpenAI client
async_client = instructor.from_openai(
    AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
)

async def async_example():
    response = await async_client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=[{"role": "user", "content": "Extract: Alice is 25 years old"}],
    )
    print(f"Name: {response.name}, Age: {response.age}")

asyncio.run(async_example())
```

Output:

```
Name: Alice, Age: 25
```
Troubleshooting
- If you get a validation error, ensure your prompt clearly matches the expected data format of your Pydantic model.
- If the API key is missing or invalid, you will see authentication errors; verify that `OPENAI_API_KEY` is set correctly.
- For unexpected output, try adding more context or examples in your prompt to guide the model.
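Validation errors come from Pydantic, not from the API, so you can reproduce one locally by validating bad data against the schema yourself. A minimal sketch, assuming Pydantic v2:

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

# Data matching the schema parses into a typed object
user = User.model_validate({"name": "John", "age": 30})

# Data that cannot be coerced raises ValidationError -- the same
# error Instructor surfaces when the model's JSON output does not
# fit the schema
try:
    User.model_validate({"name": "John", "age": "thirty"})
except ValidationError as exc:
    print(exc.error_count(), "validation error")
```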
Key Takeaways
- Use `instructor.from_openai` to wrap the OpenAI client for structured extraction.
- Define Pydantic `BaseModel` classes to specify the expected output schema.
- Pass `response_model` to `chat.completions.create` for typed responses.
- Instructor supports both synchronous and asynchronous usage patterns.
- Clear prompts aligned with your model schema improve extraction accuracy.
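One way to align prompt and schema is to document the schema itself: Instructor builds its tool/function schema from the Pydantic model, so Field descriptions typically reach the model as per-field guidance. A minimal sketch (the description strings are illustrative):

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(description="The person's full name as written in the text")
    age: int = Field(description="Age in whole years")

# The descriptions end up in the JSON schema derived from the model
print(User.model_json_schema()["properties"]["age"]["description"])
```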