How to use GPT-4o for document extraction

Quick answer

Use gpt-4o with the OpenAI Python SDK to extract structured data from documents by sending the document text as a prompt and instructing the model to parse or summarize key information. The chat.completions.create method handles the request with a clear extraction prompt and returns the extracted content in the response.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0"

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable for secure authentication.

bash

# Quote the requirement so the shell does not treat ">" as a redirect
pip install "openai>=1.0"

Step by step

This example shows how to extract key information from a document text using gpt-4o. The prompt instructs the model to parse the document and return structured data.

python

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

document_text = """
Invoice #12345
Date: 2026-04-01
Customer: Acme Corp
Total: $1,234.56
Items:
- Widget A x10 $10.00
- Widget B x5 $20.00
"""

prompt = f"Extract the invoice number, date, customer name, total amount, and list of items from the following document:\n\n{document_text}\n\nReturn the data as JSON with keys: invoice_number, date, customer, total, items." 

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)

extracted_data = response.choices[0].message.content
print("Extracted data:", extracted_data)

output

Extracted data: {
  "invoice_number": "12345",
  "date": "2026-04-01",
  "customer": "Acme Corp",
  "total": "$1,234.56",
  "items": [
    {"name": "Widget A", "quantity": 10, "price": "$10.00"},
    {"name": "Widget B", "quantity": 5, "price": "$20.00"}
  ]
}
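The reply in `response.choices[0].message.content` is a plain string, so it still has to be parsed before the fields can be used. The sketch below shows one way to do that. `parse_invoice_json` and `REQUIRED_KEYS` are hypothetical names introduced here, not part of the OpenAI SDK, and the fence-stripping step assumes the model may wrap its JSON in a Markdown code fence.

```python
import json
import re

# Hypothetical helper names; nothing here comes from the OpenAI SDK.
# Assumption: the model may wrap its JSON in a Markdown code fence.
_FENCE_RE = re.compile(r"^```(?:json)?\s*\n(.*?)\n?```\s*$", re.DOTALL)

# Keys requested in the extraction prompt above.
REQUIRED_KEYS = {"invoice_number", "date", "customer", "total", "items"}


def parse_invoice_json(raw: str) -> dict:
    """Parse the model's reply into a dict and check the expected keys."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        # Keep only the text between the opening and closing fence.
        text = match.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Calling `parse_invoice_json(extracted_data)` would then give a dictionary whose fields can be read directly, for example `data["customer"]`.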

Common variations

Use gpt-4o-mini for faster, lower-cost extraction with slightly less accuracy.
Stream long outputs by setting stream=True in chat.completions.create; the text then arrives incrementally instead of after the full response is generated.
Use the AsyncOpenAI client with asyncio and await to integrate extraction into async applications.

python

import asyncio
import os
from openai import AsyncOpenAI

async def async_extract():
    # AsyncOpenAI is the awaitable counterpart of the OpenAI client
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    document_text = "Your document text here"
    prompt = f"Extract key info from:\n{document_text}\nReturn JSON."
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    async for chunk in stream:
        # Streamed chunks carry incremental text in delta.content, which can be None
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

asyncio.run(async_extract())
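Async calls tend to pay off most when several documents are extracted at once. The following is a minimal sketch of bounded concurrency with `asyncio.gather` and a semaphore. `extract_many` and `extract_one` are hypothetical names: `extract_one` stands for any coroutine function that takes a document string and returns the extracted text, such as a wrapper around an async chat completion call. The default of 5 concurrent requests is an arbitrary assumption, not a recommended limit.

```python
import asyncio
from typing import Awaitable, Callable, List


async def extract_many(
    documents: List[str],
    extract_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,  # assumed default; tune to your rate limits
) -> List[str]:
    """Run extract_one over all documents, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(doc: str) -> str:
        # The semaphore caps how many extractions are in flight at once.
        async with semaphore:
            return await extract_one(doc)

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(bounded(doc) for doc in documents))
```

Capping concurrency this way keeps a large batch from sending every request at the same moment, which can help when rate limits apply.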

Troubleshooting

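If the model returns unstructured or incomplete data, make the prompt more explicit about the output format (for example, include a JSON schema), or pass response_format={"type": "json_object"} to chat.completions.create to request JSON-only output; the prompt must still mention JSON.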
If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
For rate limits, implement exponential backoff retries or upgrade your API plan.
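The exponential backoff mentioned above can be sketched as a small generic helper. `retry_with_backoff` is a hypothetical function, not an SDK API, and the defaults (5 attempts, 1 second base delay, random jitter) are assumptions chosen for illustration. The helper takes the exception types to retry on as a parameter so that it stays independent of any particular library.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,      # assumed default
    base_delay: float = 1.0,    # seconds; assumed default
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, plus jitter to spread out retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

With the OpenAI SDK, one plausible use is `retry_with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`. The SDK client also accepts a `max_retries` argument, which may be enough on its own for simple cases.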

Key Takeaways

Use gpt-4o with clear prompts to extract structured data from documents efficiently.
Always set your API key securely via environment variables and use the latest OpenAI Python SDK.
Use streaming for long outputs and async calls when integrating into async workflows.

Verified 2026-04 · gpt-4o, gpt-4o-mini
