How to extract lists with Instructor

Quick answer

Use the instructor Python library with a Pydantic BaseModel defining a list field. Call client.chat.completions.create with response_model set to your model to extract lists from text in a structured way.

PREREQUISITES

Python 3.9+ (the examples use the built-in list[str] annotation, which fails on 3.8)
OpenAI API key (free tier works)
pip install "openai>=1.0" instructor pydantic

Setup

Install the required packages and set your OpenAI API key in the environment.

Install with pip install openai instructor pydantic
Set environment variable OPENAI_API_KEY with your API key.

bash

pip install openai instructor pydantic

Step by step

Define a Pydantic model with a list field, then use instructor to extract the list from text via OpenAI chat completion.

python

import os
from openai import OpenAI
import instructor
from pydantic import BaseModel

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Wrap OpenAI client with Instructor
instructor_client = instructor.from_openai(client)

# Define Pydantic model with a list field
class ShoppingList(BaseModel):
    items: list[str]

# Input text containing a list
text = "Extract the shopping list: apples, bananas, oranges, and milk."

# Call chat completion with response_model to extract list
response = instructor_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": text}],
    response_model=ShoppingList
)

# Access extracted list
print("Extracted items:", response.items)

output

Extracted items: ['apples', 'bananas', 'oranges', 'milk']
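
To see what response_model asks the model to produce, the sketch below inspects the JSON schema that Pydantic derives from ShoppingList and validates a hand-written JSON payload against it, without calling the API. It assumes Pydantic v2. The raw_reply string is a made-up stand-in for a model reply, and how Instructor passes the schema to the API is not shown here.

```python
from pydantic import BaseModel, ValidationError

class ShoppingList(BaseModel):
    items: list[str]

# JSON schema derived from the model (Pydantic v2 API)
schema = ShoppingList.model_json_schema()
print(schema["properties"]["items"])

# Hypothetical reply payload, standing in for what the LLM might return
raw_reply = '{"items": ["apples", "bananas", "oranges", "milk"]}'
parsed = ShoppingList.model_validate_json(raw_reply)
print(parsed.items)

# A payload of the wrong shape (a string instead of a list) fails validation
try:
    ShoppingList.model_validate_json('{"items": "apples, bananas"}')
    rejected = False
except ValidationError:
    rejected = True
print("Rejected malformed payload:", rejected)
```

A reply is only returned as a ShoppingList if it passes this kind of validation, which is why response.items can be used as a plain Python list.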

Common variations

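Use async calls by wrapping an AsyncOpenAI client with instructor.from_openai and awaiting chat.completions.create(...). There is no acreate method on the openai>=1.0 client.
Change model to gpt-4o, which may improve accuracy on harder inputs. To use claude-3-5-sonnet-20241022, wrap an Anthropic client with instructor.from_anthropic instead of the OpenAI client.
Extract nested lists or complex structures by defining nested Pydantic models.

python

import asyncio
from openai import AsyncOpenAI

# Async calls need an async client; AsyncOpenAI reads OPENAI_API_KEY from the environment
async_client = instructor.from_openai(AsyncOpenAI())

async def async_extract():
    # Same create(...) call as the sync version, but awaited
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": text}],
        response_model=ShoppingList
    )
    print("Async extracted items:", response.items)

asyncio.run(async_extract())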

output

Async extracted items: ['apples', 'bananas', 'oranges', 'milk']
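
The nested-structure variation listed under Common variations can be sketched as follows. The Item, Section, and GroceryPlan models and the sample dictionary are hypothetical, chosen only for illustration. The sample is validated locally so the shape can be checked without an API call. In a real extraction you would pass response_model=GroceryPlan to the same create call used earlier.

```python
from typing import Optional
from pydantic import BaseModel

# Hypothetical nested models: a plan holds sections, a section holds items
class Item(BaseModel):
    name: str
    quantity: Optional[int] = None  # optional, since text may omit amounts

class Section(BaseModel):
    title: str
    items: list[Item]

class GroceryPlan(BaseModel):
    sections: list[Section]

# Made-up data in the shape an extraction would need to produce
sample = {
    "sections": [
        {"title": "fruit", "items": [{"name": "apples", "quantity": 3},
                                     {"name": "bananas"}]},
        {"title": "dairy", "items": [{"name": "milk", "quantity": 1}]},
    ]
}

plan = GroceryPlan.model_validate(sample)
for section in plan.sections:
    names = [item.name for item in section.items]
    print(section.title, names)
```

Each nested list is validated element by element, so a single malformed item is enough to fail validation of the whole plan.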

Troubleshooting

If the list extraction is incomplete or incorrect, try increasing max_tokens or using a stronger model like gpt-4o.
Ensure your Pydantic model matches the expected output format exactly.
If you get validation errors, check the input prompt clarity and model choice.
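
One way to make validation failures easier to diagnose is to put the rules into the model itself. The sketch below assumes Pydantic v2 and uses a hypothetical StrictShoppingList model whose validator normalizes items and rejects an empty list. It runs locally, with no API call.

```python
from pydantic import BaseModel, ValidationError, field_validator

# Hypothetical stricter variant of ShoppingList
class StrictShoppingList(BaseModel):
    items: list[str]

    @field_validator("items")
    @classmethod
    def items_must_be_clean(cls, value: list[str]) -> list[str]:
        # Trim whitespace, lowercase, and drop blank entries
        cleaned = [item.strip().lower() for item in value if item.strip()]
        if not cleaned:
            raise ValueError("expected at least one item")
        return cleaned

ok = StrictShoppingList(items=[" Apples ", "milk", ""])
print(ok.items)  # ['apples', 'milk']

try:
    StrictShoppingList(items=[])
    error_message = None
except ValidationError as exc:
    error_message = str(exc)
print(error_message)

# With Instructor, passing max_retries (for example max_retries=2) to
# chat.completions.create asks it to re-prompt the model when validation fails.
```

The error message names the failing field and the reason, which helps tell a prompt problem apart from a schema problem.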

Key Takeaways

Use instructor with Pydantic models to extract structured lists from text.
Set response_model in chat.completions.create for automatic parsing.
An async client lets you run extractions concurrently, and a stronger model can improve accuracy.

Verified 2026-04 · gpt-4o-mini, gpt-4o, claude-3-5-sonnet-20241022
