
How to use GPT-4o for document extraction

Quick answer
Use gpt-4o with the OpenAI Python SDK to extract structured data from a document: send the document text in a prompt that names the fields you want and the output format. The chat.completions.create method sends the request, and the extracted content comes back as a string in response.choices[0].message.content.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key with access to gpt-4o (check your account's billing and model access)
  • pip install "openai>=1.0"

Setup

Install the official OpenAI Python SDK and set your API key as an environment variable for secure authentication.

bash
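# Quote the requirement so the shell does not treat ">" as a redirect
pip install "openai>=1.0"
# Replace the placeholder with your own key
export OPENAI_API_KEY="your-api-key"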

Step by step

This example shows how to extract key information from a document text using gpt-4o. The prompt instructs the model to parse the document and return structured data.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample invoice text; in practice this comes from your own file or OCR output
document_text = """
Invoice #12345
Date: 2026-04-01
Customer: Acme Corp
Total: $1,234.56
Items:
- Widget A x10 $10.00
- Widget B x5 $20.00
"""

# Name every field and the output format explicitly
prompt = (
    "Extract the invoice number, date, customer name, total amount, "
    "and list of items from the following document:\n\n"
    f"{document_text}\n\n"
    "Return the data as JSON with keys: invoice_number, date, customer, total, items."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)

extracted_data = response.choices[0].message.content
print("Extracted data:", extracted_data)
output
Extracted data: {
  "invoice_number": "12345",
  "date": "2026-04-01",
  "customer": "Acme Corp",
  "total": "$1,234.56",
  "items": [
    {"name": "Widget A", "quantity": 10, "price": "$10.00"},
    {"name": "Widget B", "quantity": 5, "price": "$20.00"}
  ]
}
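
The reply is a plain string, and models sometimes wrap JSON in a Markdown code fence, so it usually needs parsing before use. The following is a minimal sketch of that step, assuming the five keys requested in the prompt above; `parse_extraction` and `parse_amount` are hypothetical helper names, not part of the OpenAI SDK.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker the model may wrap its reply in
REQUIRED_KEYS = {"invoice_number", "date", "customer", "total", "items"}


def parse_extraction(raw: str) -> dict:
    """Parse the model's reply into a dict and check the expected keys exist."""
    text = raw.strip()
    # Drop a surrounding code fence (with or without a "json" tag) if present.
    fenced = re.match(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def parse_amount(value: str) -> float:
    """Convert a string such as '$1,234.56' to a float."""
    return float(value.replace("$", "").replace(",", ""))


# Example with a reply shaped like the output shown above.
sample_reply = (
    '{"invoice_number": "12345", "date": "2026-04-01", "customer": "Acme Corp", '
    '"total": "$1,234.56", "items": []}'
)
invoice = parse_extraction(sample_reply)
print(invoice["customer"], parse_amount(invoice["total"]))
```

In the script above you would call `parse_extraction(extracted_data)`. Passing `response_format={"type": "json_object"}` to `chat.completions.create` is another option for getting syntactically valid JSON, though the required-keys check is still worth keeping.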

Common variations

  • Use gpt-4o-mini for faster, lower-cost extraction with slightly less accuracy.
  • Stream the response by setting stream=True in chat.completions.create to show output as it is generated; this helps with long outputs but does not raise the model's input size limit.
  • Use asynchronous calls with asyncio and await for integration in async applications.
python
import asyncio
import os
from openai import AsyncOpenAI

async def async_extract():
    # AsyncOpenAI is the awaitable client; the method is still create(), not acreate()
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    document_text = "Your document text here"
    prompt = f"Extract key info from:\n{document_text}\nReturn JSON."
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    async for chunk in stream:
        # Streamed chunks carry incremental text in delta.content, which may be None
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
    print()

asyncio.run(async_extract())
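
In an async application the usual reason to go async is to process several documents at once. The sketch below shows one way to cap concurrency with a semaphore; `extract_many` is a hypothetical helper, and `extract_one` stands for any coroutine you write that takes a document string and returns its extraction (for example, one that wraps an async chat.completions.create call).

```python
import asyncio


async def extract_many(documents, extract_one, max_concurrency=5):
    """Run extract_one over all documents, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(doc):
        # The semaphore limits how many extractions are in flight at once.
        async with semaphore:
            return await extract_one(doc)

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(guarded(doc) for doc in documents))


# Stand-in extractor so the sketch runs without network access.
async def fake_extract(doc):
    await asyncio.sleep(0)
    return {"length": len(doc)}


results = asyncio.run(extract_many(["invoice one", "invoice two"], fake_extract))
print(results)
```

A small limit keeps the number of simultaneous requests predictable, which helps when you are close to your account's rate limits.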

Troubleshooting

  • If the model returns unstructured or incomplete data, refine your prompt to be more explicit about the output format (e.g., JSON schema).
  • If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • For rate limits, implement exponential backoff retries or upgrade your API plan.
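
The exponential backoff mentioned above can be sketched as a small wrapper. This is an illustrative helper, not an SDK function: `with_backoff` and its parameters are hypothetical names, and the delay values are arbitrary starting points.

```python
import random
import time


def with_backoff(call, max_attempts=5, base_delay=1.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call call() and retry on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Double the wait each attempt and add jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

With the OpenAI SDK you might pass `retry_on=(openai.RateLimitError,)` and wrap the request in a lambda, for example `with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`. The client constructor also accepts a `max_retries` argument, which may be enough on its own for simple cases.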

Key Takeaways

  • Use gpt-4o with clear prompts to extract structured data from documents efficiently.
  • Always set your API key securely via environment variables and use the latest OpenAI Python SDK.
  • Use streaming to show long outputs as they are generated, and async calls to integrate extraction into async workflows.
Verified 2026-04 · gpt-4o, gpt-4o-mini