How to beginner · 4 min read

Fix wrong field extraction in Instructor

Quick answer

To fix wrong field extraction in instructor, define a precise Pydantic BaseModel whose field names and types describe the data you want, and pass it as response_model= to chat.completions.create on an instructor-wrapped client. Make sure your messages contain the text to extract from and ask for data that conforms to your model's schema.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0"
pip install instructor pydantic

Setup

Install the required packages and set your OpenAI API key as an environment variable.

bash

pip install "openai>=1.0" instructor pydantic

# Replace the placeholder with your own key
export OPENAI_API_KEY="your-api-key"

Step by step

Define a Pydantic model that exactly matches the fields you want to extract. Use instructor.from_openai to wrap the OpenAI client and pass the model as response_model in the chat completion call. Provide a clear extraction prompt.

python

import os
from openai import OpenAI
import instructor
from pydantic import BaseModel

# Define the Pydantic model matching expected extraction fields
class User(BaseModel):
    name: str
    age: int

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Wrap with instructor client
inst_client = instructor.from_openai(client)

# Prepare messages
messages = [{"role": "user", "content": "Extract: John is 30 years old"}]

# Call chat completion with response_model
user = inst_client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=User,
    messages=messages
)

print(user.name, user.age)

output

John 30
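
Field names alone can be ambiguous to the model. One way to make the model more precise is to attach descriptions and constraints with Pydantic's Field. The sketch below assumes Pydantic v2, and the description strings are illustrative, not part of the original example. It runs offline and prints the JSON schema derived from the model, which is what instructor uses to describe the expected output.

```python
import json

from pydantic import BaseModel, Field


class User(BaseModel):
    # Descriptions and constraints end up in the generated JSON schema.
    name: str = Field(description="The person's full name as written in the text")
    age: int = Field(description="Age in whole years", ge=0)


# Inspect the schema without calling any API
schema = User.model_json_schema()
print(json.dumps(schema["properties"], indent=2))
```

If a field keeps coming back wrong, reading this schema is a quick way to see whether the name, type, and description actually say what you mean.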

Common variations

You can use other OpenAI models such as gpt-4o, or Anthropic models by wrapping an Anthropic client with instructor.from_anthropic. For asynchronous usage, wrap an async client and await the call inside an async function. In every case, keep your Pydantic model fields aligned with the data you expect to avoid extraction errors.

python

import asyncio
import os
from openai import AsyncOpenAI
import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

async def main():
    # Use the async client; the wrapped create() call is then awaitable
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    inst_client = instructor.from_openai(client)
    messages = [{"role": "user", "content": "Extract: Alice is 25 years old"}]
    user = await inst_client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=User,
        messages=messages
    )
    print(user.name, user.age)

asyncio.run(main())

output

Alice 25
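
For the Anthropic variation, a minimal sketch might look like the following. It assumes the anthropic package is installed (pip install anthropic) and that ANTHROPIC_API_KEY is set in the environment. The function name extract_user_with_anthropic is hypothetical and chosen only for illustration.

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


def extract_user_with_anthropic(text: str) -> User:
    """Hypothetical helper: extract a User from free text with an Anthropic model."""
    # Imported lazily so this module loads even if anthropic is not installed.
    import anthropic
    import instructor

    # Anthropic() reads ANTHROPIC_API_KEY from the environment.
    client = instructor.from_anthropic(anthropic.Anthropic())
    return client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,  # the Anthropic Messages API requires max_tokens
        response_model=User,
        messages=[{"role": "user", "content": f"Extract: {text}"}],
    )
```

Calling extract_user_with_anthropic("Alice is 25 years old") should return a User instance, just as the OpenAI examples do. The same User model is reused unchanged, so only the client setup differs.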

Troubleshooting

If fields are missing or hold the wrong values, check that your Pydantic field names and types describe the data you want. The schema the model is asked to fill is derived from them, so a vague or misnamed field tends to produce a wrong value.
Ensure your prompt clearly instructs the model to output structured data matching your model.
Check that you pass the response_model= parameter, not response_format=.
Update instructor and openai packages to the latest versions to avoid compatibility issues.
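
To see which fields a payload is missing, you can validate it against your model offline. This is a debugging sketch, assuming Pydantic v2. The helper name missing_fields and the sample payload are made up for illustration.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


def missing_fields(payload: dict) -> List[str]:
    """Return the names of required User fields absent from payload."""
    try:
        User.model_validate(payload)
    except ValidationError as exc:
        # Keep only "missing" errors; loc[0] is the top-level field name.
        return [str(err["loc"][0]) for err in exc.errors() if err["type"] == "missing"]
    return []


# "full_name" does not match the model's "name" field
print(missing_fields({"full_name": "John", "age": 30}))  # ['name']
```

Unknown keys such as full_name are ignored by default, which is why a renamed key shows up as a missing field, not as an unexpected one.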

Key Takeaways

Define Pydantic models that exactly match the expected extraction fields to fix wrong field issues.
Use the response_model= parameter in instructor chat completions for structured extraction.
Ensure prompts clearly instruct the model to output data conforming to your Pydantic schema.
Keep instructor and openai SDKs updated for best compatibility.
For asynchronous extraction, wrap AsyncOpenAI and await chat.completions.create.

Verified 2026-04 · gpt-4o-mini, gpt-4o, claude-3-5-sonnet-20241022