
How to extract medical information from records with AI

Quick answer
Use a large language model such as gpt-4o to extract structured data (diagnoses, medications, patient details) from medical records by prompting it for specific fields. Combine OpenAI API calls with careful prompt design and, where accuracy matters, few-shot examples or fine-tuning.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the openai Python package and set your API key as an environment variable for secure access.

bash
pip install "openai>=1.0"
export OPENAI_API_KEY="your-api-key"
output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl
Installing collected packages: openai
Successfully installed openai-1.x.x

Step by step

This example shows how to send a medical record text to gpt-4o and extract key medical information like patient name, age, diagnosis, and medications using prompt engineering.

python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample record; in practice this text comes from your own data source
medical_record = """Patient Name: John Doe
Age: 45
Chief Complaint: Persistent cough and fever
Diagnosis: Acute bronchitis
Medications: Amoxicillin 500mg, Paracetamol 650mg
"""

prompt = f"Extract the following information from the medical record:\n- Patient Name\n- Age\n- Diagnosis\n- Medications\n\nMedical record:\n{medical_record}\n\nProvide the output as JSON."

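response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    # JSON mode: ask the API to return a single valid JSON object
    response_format={"type": "json_object"},
)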

print("Extracted medical info:", response.choices[0].message.content)
output
Extracted medical info: {
  "Patient Name": "John Doe",
  "Age": 45,
  "Diagnosis": "Acute bronchitis",
  "Medications": ["Amoxicillin 500mg", "Paracetamol 650mg"]
}
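
The model's reply is plain text, so it has to be parsed before downstream code can use it. The following is a minimal sketch of that step, assuming the reply may arrive wrapped in a Markdown code fence (which models sometimes add) and that the key names match those requested in the prompt above. The names `parse_extraction`, `REQUIRED_KEYS`, and `FENCE` are hypothetical and not part of the OpenAI SDK.

```python
import json

# Keys requested in the prompt above; adjust if your prompt asks for other fields.
REQUIRED_KEYS = ("Patient Name", "Age", "Diagnosis", "Medications")
FENCE = "`" * 3  # Markdown code-fence marker

def parse_extraction(raw: str) -> dict:
    """Parse the model's reply into a dict and check that the expected keys exist."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()
        # Drop the opening fence line and, if present, the closing fence line.
        lines = lines[1:-1] if lines[-1].strip() == FENCE else lines[1:]
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Calling `parse_extraction(response.choices[0].message.content)` would then give a dictionary, or raise an error that you can log and route to manual review.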

Common variations

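You can use asynchronous calls with asyncio to process records concurrently, switch to another model such as claude-3-5-sonnet-20241022 (which requires Anthropic's own SDK and API key rather than the OpenAI client), or add few-shot examples to the prompt to improve extraction accuracy. The example below shows the asynchronous variant with gpt-4o-mini.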

python
import os
import asyncio
from openai import AsyncOpenAI

# AsyncOpenAI exposes awaitable methods; the synchronous OpenAI client cannot be awaited
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def extract_medical_info_async(record: str):
    prompt = f"Extract patient name, age, diagnosis, and medications from the medical record:\n{record}\nReturn JSON."
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

async def main():
    record = "Patient Name: Jane Smith\nAge: 60\nDiagnosis: Hypertension\nMedications: Lisinopril 10mg"
    result = await extract_medical_info_async(record)
    print("Async extracted info:", result)

asyncio.run(main())
output
Async extracted info: {
  "Patient Name": "Jane Smith",
  "Age": 60,
  "Diagnosis": "Hypertension",
  "Medications": ["Lisinopril 10mg"]
}
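
The async example above handles a single record. To process many records, one option is to run the calls concurrently while capping how many are in flight at once. This is a sketch under stated assumptions: `extract_batch` and `max_concurrency` are hypothetical names, and `extract` stands for any coroutine function that takes a record and returns the extracted text, such as `extract_medical_info_async`.

```python
import asyncio
from typing import Awaitable, Callable, List

async def extract_batch(
    records: List[str],
    extract: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> List[str]:
    """Run `extract` over all records, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(record: str) -> str:
        # The semaphore blocks here once max_concurrency calls are in flight.
        async with semaphore:
            return await extract(record)

    # gather returns results in the same order as the input records.
    return await asyncio.gather(*(bounded(r) for r in records))
```

Inside `main()`, this could be used as `results = await extract_batch(records, extract_medical_info_async)`. A lower `max_concurrency` makes rate-limit errors less likely at the cost of throughput.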

Troubleshooting

  • If the model returns incomplete or ambiguous data, improve prompt clarity or add few-shot examples.
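  • Medical records usually contain protected health information. De-identify them before sending them to a third-party API where possible, and confirm that your setup complies with HIPAA or the equivalent regulation that applies to you.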
  • If you hit rate limits, implement exponential backoff or batch requests.
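
The exponential backoff mentioned in the last item can be sketched as a small retry helper. This is an illustration rather than the SDK's own mechanism: `with_backoff` and its parameters are hypothetical names, and the delays (1 s, 2 s, 4 s, and so on, plus up to 1 s of random jitter) are assumed values to tune for your workload.

```python
import random
import time

def with_backoff(call, retryable=(Exception,), max_attempts=5,
                 base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry on `retryable` errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt; jitter spreads out simultaneous retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

With the OpenAI client, one option is to pass `retryable=(openai.RateLimitError,)` and wrap the request in a lambda. The client constructor also accepts a `max_retries` argument, which may make a hand-written loop unnecessary.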

Key Takeaways

  • Use prompt engineering to guide gpt-4o in extracting structured medical data from unstructured records.
  • Asynchronous API calls let you process many records concurrently.
  • Fine-tuning or few-shot learning can improve accuracy for domain-specific medical extraction tasks.
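
Few-shot prompting, as mentioned in the last takeaway, amounts to placing worked examples in the conversation before the record you want processed. A minimal sketch, assuming each example is a pair of record text and the JSON you want back; `build_few_shot_messages` is a hypothetical helper and the system instruction is illustrative wording.

```python
from typing import Dict, List, Tuple

def build_few_shot_messages(
    examples: List[Tuple[str, str]], record: str
) -> List[Dict[str, str]]:
    """Build a chat `messages` list: instruction, worked examples, then the new record."""
    messages = [{
        "role": "system",
        "content": "Extract patient name, age, diagnosis, and medications "
                   "from the medical record. Return JSON only.",
    }]
    for example_record, example_json in examples:
        # Each example is a user turn (the record) plus the assistant reply to imitate.
        messages.append({"role": "user", "content": example_record})
        messages.append({"role": "assistant", "content": example_json})
    # The record to extract from goes last, as the final user turn.
    messages.append({"role": "user", "content": record})
    return messages
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`. Examples drawn from your own record formats tend to matter more than their number.
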
Verified 2026-04 · gpt-4o, gpt-4o-mini, claude-3-5-sonnet-20241022