How to use GPT-4o for document extraction
Quick answer
Use gpt-4o with the OpenAI Python SDK to extract structured data from documents: send the document text in a prompt that tells the model which information to parse or summarize. The chat.completions.create method sends the request and returns the extracted content in the response.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable for secure authentication.
pip install "openai>=1.0"
Step by step
This example shows how to extract key information from document text using gpt-4o. The prompt instructs the model to parse the document and return structured data.
import os
from openai import OpenAI

# Read the API key from the environment instead of hard-coding it.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

document_text = """
Invoice #12345\nDate: 2026-04-01\nCustomer: Acme Corp\nTotal: $1,234.56\nItems:\n- Widget A x10 $10.00\n- Widget B x5 $20.00
"""

# Name the fields and the output format explicitly in the prompt.
prompt = f"Extract the invoice number, date, customer name, total amount, and list of items from the following document:\n\n{document_text}\n\nReturn the data as JSON with keys: invoice_number, date, customer, total, items."

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

# The reply is a plain string; the prompt asks for JSON but the format is not enforced.
extracted_data = response.choices[0].message.content
print("Extracted data:", extracted_data)
Output
Extracted data: {
  "invoice_number": "12345",
  "date": "2026-04-01",
  "customer": "Acme Corp",
  "total": "$1,234.56",
  "items": [
    {"name": "Widget A", "quantity": 10, "price": "$10.00"},
    {"name": "Widget B", "quantity": 5, "price": "$20.00"}
  ]
}
Common variations
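One variation worth sketching first: the reply in the step-by-step example is a plain string, so downstream code usually has to parse it before use. The sketch below requests JSON mode by passing response_format={"type": "json_object"} to chat.completions.create, then parses the reply and checks for the keys named in the prompt. The names extract_invoice and REQUIRED_KEYS are hypothetical, and client is assumed to be an OpenAI instance created as in the step-by-step example.

```python
import json

# Keys requested in the extraction prompt (same keys as the invoice example).
REQUIRED_KEYS = {"invoice_number", "date", "customer", "total", "items"}


def extract_invoice(client, document_text, model="gpt-4o"):
    """Request a JSON reply and return it as a dict (hypothetical helper)."""
    prompt = (
        "Extract the invoice number, date, customer name, total amount, and "
        "list of items from the following document.\n\n"
        f"{document_text}\n\n"
        "Return the data as JSON with keys: "
        + ", ".join(sorted(REQUIRED_KEYS))
        + "."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        # JSON mode: the prompt itself must mention JSON for this to be accepted.
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object in the model reply")
    missing = REQUIRED_KEYS.difference(data)
    if missing:
        raise ValueError(f"Missing keys in model reply: {sorted(missing)}")
    return data


# Usage (assumes `client` and `document_text` from the step-by-step example):
# invoice = extract_invoice(client, document_text)
# print(invoice["total"])
```

JSON mode only makes the reply syntactically valid JSON; it does not guarantee that the requested keys are present, which is why the sketch checks them explicitly.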
- Use gpt-4o-mini for faster, lower-cost extraction with slightly less accuracy.
- Stream responses for large documents by setting stream=True in chat.completions.create.
- Use asynchronous calls with asyncio and await (via the AsyncOpenAI client) to integrate extraction into async applications.
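The streaming variation above can be sketched synchronously as follows. Streamed chunks expose incremental text under choices[0].delta.content rather than message.content, so a JSON reply has to be accumulated in full before it can be parsed. The name stream_extraction is hypothetical, and client is assumed to be an OpenAI instance.

```python
def stream_extraction(client, prompt, model="gpt-4o-mini"):
    """Stream a reply, echo it as it arrives, and return the full text (hypothetical helper)."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            print(piece, end="", flush=True)
            parts.append(piece)
    return "".join(parts)


# Usage (assumes `client` is an OpenAI instance):
# text = stream_extraction(client, "Extract key info from: ...\nReturn JSON.")
```

Returning the joined text keeps the caller free to parse it once the stream has finished.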
import asyncio
import os
from openai import AsyncOpenAI

async def async_extract():
    # Async streaming variant: AsyncOpenAI provides awaitable methods (openai>=1.0 has no acreate).
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    document_text = "Your document text here"
    prompt = f"Extract key info from:\n{document_text}\nReturn JSON."
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    async for chunk in stream:
        # Streamed chunks carry incremental text in delta.content, which may be None.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")

asyncio.run(async_extract())
Troubleshooting
- If the model returns unstructured or incomplete data, refine your prompt to be more explicit about the output format (e.g., JSON schema).
- If you get authentication errors, verify that your OPENAI_API_KEY environment variable is set correctly.
- For rate limits, implement retries with exponential backoff or upgrade your API plan.
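The rate-limit advice above can be sketched as a small retry helper. This is a minimal illustration, not the SDK's own mechanism: call_with_backoff and its parameters are hypothetical names, and the exception type to retry on is passed in by the caller so the sketch stays independent of any particular library.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exception type(s) with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Delay doubles on each attempt (1s, 2s, 4s, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the 1.x SDK this might be used as call_with_backoff(lambda: client.chat.completions.create(model="gpt-4o", messages=messages), retry_on=openai.RateLimitError), where messages is the list built earlier. The client also accepts a max_retries argument for its built-in retries, which may be sufficient for simpler cases.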
Key Takeaways
- Use gpt-4o with clear prompts to extract structured data from documents efficiently.
- Always set your API key securely via environment variables and use the latest OpenAI Python SDK.
- Leverage streaming and async calls for handling large documents or integrating into async workflows.