How-to · Beginner · 3 min read

How to extract entities with Instructor

Quick answer

Use the instructor Python library with an OpenAI client to extract entities: define a pydantic.BaseModel describing the expected structure and pass it as response_model in client.chat.completions.create. The call then returns a validated instance of that model instead of free-form text.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0" instructor pydantic (quote the version specifier so the shell does not treat >= as a redirect)

Setup

Install the required packages and set your OpenAI API key as an environment variable.

Install packages: pip install openai instructor pydantic
Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows; open a new terminal afterwards, because setx does not change the current session)

bash

pip install openai instructor pydantic
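
Before making any API calls, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and not part of openai or instructor.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the named environment variable, or raise a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail early with a clear message instead of a KeyError deep in client setup
        raise RuntimeError(
            f"{name} is not set; export it in your shell before running the examples."
        )
    return value
```

One possible use is client = OpenAI(api_key=require_api_key()), which fails with the message above when the variable is missing or empty.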

Step by step

Define a pydantic.BaseModel to specify the entity fields you want to extract. Use instructor.from_openai to wrap the OpenAI client, then call chat.completions.create with response_model to parse the output into structured entities.

python

import os
from openai import OpenAI
import instructor
from pydantic import BaseModel

# Define the entity extraction schema
class Entities(BaseModel):
    person: str
    organization: str
    location: str

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Wrap OpenAI client with Instructor
inst_client = instructor.from_openai(client)

# Input text to extract entities from
text = "John Doe works at OpenAI in San Francisco."

# Call chat completion with response_model for structured extraction
response = inst_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Extract person, organization, and location from this text: {text}"}],
    response_model=Entities
)

# With response_model set, the return value is already a validated Entities instance
entities = response
print(f"Person: {entities.person}")
print(f"Organization: {entities.organization}")
print(f"Location: {entities.location}")

output

Person: John Doe
Organization: OpenAI
Location: San Francisco
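
The schema above holds exactly one value per entity type, so a sentence that mentions two people cannot be represented. One possible variation, sketched here under the assumption that list-valued fields suit your text, is a model where each type may appear zero or more times. The class name EntityList and its field names are hypothetical. The snippet validates a hand-written payload offline, so it runs without an API key.

```python
from typing import List

from pydantic import BaseModel, Field


class EntityList(BaseModel):
    """Hypothetical schema: every entity type may appear zero or more times."""

    people: List[str] = Field(default_factory=list, description="Names of people mentioned")
    organizations: List[str] = Field(default_factory=list, description="Companies or institutions mentioned")
    locations: List[str] = Field(default_factory=list, description="Cities, countries, or other places mentioned")


# Offline check: validate a payload shaped like what the model is asked to return
sample = {
    "people": ["John Doe", "Jane Roe"],
    "organizations": ["OpenAI"],
}
entities = EntityList.model_validate(sample)
print(entities.people)     # ['John Doe', 'Jane Roe']
print(entities.locations)  # [] (missing key falls back to the empty default)
```

Passing response_model=EntityList in place of Entities in the call above would request this shape from the model.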

Common variations

You can use different models, such as gpt-4o for higher accuracy or gpt-4o-mini for lower cost. The instructor library also supports Anthropic clients via instructor.from_anthropic. For asynchronous usage, wrap an AsyncOpenAI client and await the same chat.completions.create call.

python

import asyncio
import os
from openai import AsyncOpenAI
import instructor
from pydantic import BaseModel

class Entities(BaseModel):
    person: str
    organization: str
    location: str

async def async_entity_extraction():
    # Use the async client so the wrapped create() call can be awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    inst_client = instructor.from_openai(client)

    text = "Alice works at Anthropic in San Francisco."

    response = await inst_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Extract person, organization, and location from this text: {text}"}],
        response_model=Entities
    )

    print(f"Person: {response.person}")
    print(f"Organization: {response.organization}")
    print(f"Location: {response.location}")

asyncio.run(async_entity_extraction())

output

Person: Alice
Organization: Anthropic
Location: San Francisco

Troubleshooting

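If you get validation errors, check that your pydantic.BaseModel matches what the text can actually supply; a required field with no counterpart in the input will fail validation.
If the output is malformed or incomplete, refine the prompt to name each field explicitly, or pass max_retries to chat.completions.create so Instructor re-asks the model after a failed validation.
Check that your OPENAI_API_KEY environment variable is set correctly.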
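
The first point can be reproduced offline with pydantic alone. The sketch below assumes a payload in which the location is absent, and compares a strict schema with a lenient one whose fields are Optional. The class names StrictEntities and LenientEntities are hypothetical.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class StrictEntities(BaseModel):
    person: str
    organization: str
    location: str


class LenientEntities(BaseModel):
    # Optional fields default to None when the text does not mention them
    person: Optional[str] = None
    organization: Optional[str] = None
    location: Optional[str] = None


partial = {"person": "Alice", "organization": "Anthropic"}  # no location given

missing = []
try:
    StrictEntities.model_validate(partial)
except ValidationError as exc:
    # Collect the names of required fields that were absent
    missing = [err["loc"][0] for err in exc.errors() if err["type"] == "missing"]
    print(f"Strict schema rejected the payload; missing: {missing}")

lenient = LenientEntities.model_validate(partial)
print(lenient.location)  # None
```

Whether a missing entity should be an error or a None is a design choice: strict fields surface gaps early, while Optional fields let extraction succeed on sparse text.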

Key Takeaways

Use instructor with a pydantic.BaseModel to extract structured entities from text.
Pass response_model to chat.completions.create for automatic parsing.
Refine prompts to improve extraction accuracy and output format.
Instructor supports both synchronous and asynchronous usage by wrapping OpenAI or AsyncOpenAI clients.
Always set your API key securely via environment variables.

Verified 2026-04 · gpt-4o-mini, gpt-4o
