How Instructor works
Quick answer
The Instructor Python SDK enables structured extraction by combining AI chat completions with pydantic models. You pass a response_model, and the AI response is parsed and validated into a typed Python object, which simplifies data extraction from natural language.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0" instructor pydantic (quote the version specifier so the shell does not treat > as a redirect)
Setup
Install the instructor package along with openai and pydantic. Set your OpenAI API key as an environment variable.
- Run:
  pip install openai instructor pydantic
- Export your API key:
  export OPENAI_API_KEY='your_key_here'
Step by step
Define a pydantic.BaseModel for the structured data you want to extract. Use instructor.from_openai to wrap the OpenAI client. Call chat.completions.create with your model, messages, and the response_model. The SDK parses the AI response into your model instance.
import os

from pydantic import BaseModel
from openai import OpenAI
import instructor


# Define the structured response model
class User(BaseModel):
    name: str
    age: int


# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Wrap with Instructor
instructor_client = instructor.from_openai(client)

# Create chat completion with response_model
response = instructor_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Extract: John is 30 years old"}],
    response_model=User,
)

# Access structured data
print(response.name)  # John
print(response.age)   # 30
Output:
John
30
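To make the parsing step less opaque, the sketch below approximates what a response_model contributes, using pydantic alone (pydantic v2 is assumed). It is a simplified illustration of the idea, not Instructor's actual internals, and raw_reply is a hypothetical stand-in for the model's output, so no API call is made.

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# JSON schema derived from the model; a schema like this tells the LLM
# which fields and types are expected in its reply.
schema = User.model_json_schema()
print(sorted(schema["properties"]))  # ['age', 'name']

# Hypothetical raw JSON reply standing in for the LLM output.
raw_reply = '{"name": "John", "age": 30}'

# Validation turns the JSON text into a typed User instance.
user = User.model_validate_json(raw_reply)
print(user.name, user.age)  # John 30
```

The object returned by create in the example above is an instance of the same kind, which is why response.name and response.age are ordinary typed attributes.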
Common variations
You can use instructor.from_anthropic to wrap an Anthropic client in the same way. Set the model to a chat model offered by the provider whose client you wrapped. Async usage is also supported: wrap an async client such as AsyncOpenAI with instructor.from_openai and await the create call. You can define nested pydantic models for richer extraction.
import os

import anthropic
import instructor
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
instructor_client = instructor.from_anthropic(client)

response = instructor_client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,  # Anthropic's Messages API requires max_tokens
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Extract: Alice is 25 years old"}],
    response_model=User,
)

print(response.name)  # Alice
print(response.age)   # 25
Output:
Alice
25
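As a sketch of the nested-model variation, the example below defines a model that contains a list of sub-models and validates a sample payload with pydantic alone (v2 assumed). The Address and Person classes and the raw_reply string are hypothetical and only illustrate the shape of the data; no API call is made.

```python
from typing import List

from pydantic import BaseModel


class Address(BaseModel):
    city: str
    country: str


class Person(BaseModel):
    name: str
    age: int
    addresses: List[Address]  # nested list of sub-models


# Hypothetical JSON reply standing in for the LLM output.
raw_reply = (
    '{"name": "Alice", "age": 25, '
    '"addresses": [{"city": "Paris", "country": "France"}]}'
)

# Nested objects are validated recursively into Address instances.
person = Person.model_validate_json(raw_reply)
print(person.addresses[0].city)  # Paris
```

A model like Person would be passed as response_model in the same way as User in the examples above.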
Troubleshooting
- If you get validation errors, ensure your pydantic model matches the expected AI response format.
- If the AI returns unexpected text, try refining your prompt for clearer extraction instructions.
- Check that your API key environment variable is set correctly to avoid authentication errors.
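When diagnosing validation errors, it can help to see which field failed. The sketch below reproduces a mismatch offline with pydantic alone (v2 assumed); bad_reply is a hypothetical malformed payload, not real model output.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# Hypothetical malformed reply: "age" cannot be parsed as an integer.
bad_reply = '{"name": "John", "age": "thirty"}'

try:
    User.model_validate_json(bad_reply)
    failed_fields = []
except ValidationError as exc:
    # Each error entry records the location of the offending field.
    failed_fields = [err["loc"][0] for err in exc.errors()]

print(failed_fields)  # ['age']
```

Instructor's create call also accepts a max_retries argument, which re-asks the model when validation fails; whether that is preferable to tightening the prompt or loosening the model depends on the use case.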
Key Takeaways
- Use instructor to parse AI chat responses into typed Python objects with pydantic models.
- Wrap your OpenAI or Anthropic client with instructor.from_openai or instructor.from_anthropic for seamless integration.
- Define clear response_model schemas to ensure reliable structured extraction from natural language.
- Refine prompts and validate models to handle AI response variations and avoid parsing errors.