How to extract key-value pairs with an LLM

Call chat.completions.create with a prompt that instructs the LLM to read the text and return its key-value pairs as a JSON object. Then parse the JSON response in Python, for example with json.loads, to get the pairs as a dictionary.

PREREQUISITES

Python 3.8+
OpenAI API key (free tier works)
pip install "openai>=1.0"

Setup

Install the official openai Python SDK and set your API key as an environment variable.

Install SDK: pip install openai
Set environment variable in your shell: export OPENAI_API_KEY='your_api_key'

bash

pip install openai

output

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
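
Before running the example, it can help to confirm that the environment variable is actually visible to Python. The following is a minimal sketch; the helper name require_api_key is hypothetical and not part of the openai SDK.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Illustrative helper only; `env` defaults to os.environ and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it in your shell before running the script."
        )
    return key
```

Calling require_api_key() at the top of a script and passing the result to OpenAI(api_key=...) fails early with a readable message instead of a KeyError or a later authentication error.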

Step by step

This example sends a prompt to gpt-4o-mini instructing it to extract key-value pairs from a given text and return them as JSON. The Python code then parses the JSON response.

python

import os
import json
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

text_to_parse = "Name: Alice, Age: 30, City: New York"

prompt = f"Extract the key-value pairs from the following text and return a JSON object:\n\n{text_to_parse}\n\nJSON:" 

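# Request JSON mode so the reply is a bare JSON object rather than prose or a fenced block.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)

# The reply text should contain only the JSON object.
json_text = response.choices[0].message.content.strip()

try:
    key_values = json.loads(json_text)
except json.JSONDecodeError:
    # Fall back to an empty dict, but show the raw reply so the failure is visible.
    print("Could not parse model output as JSON:", json_text)
    key_values = {}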

print("Extracted key-value pairs:", key_values)

output

Extracted key-value pairs: {'Name': 'Alice', 'Age': 30, 'City': 'New York'}

Common variations

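Use gpt-4o-mini for cost-effective extraction. claude-3-5-sonnet-20241022 is an alternative, but it is called through Anthropic's SDK rather than the openai package, so the client code differs.
For asynchronous calls, use the SDK's AsyncOpenAI client and await chat.completions.create inside an asyncio event loop.
To handle streaming, set stream=True and process chunks incrementally. Concatenate the chunks before parsing, since a partial JSON string will not parse.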
Adjust prompt instructions to extract nested or complex key-value structures.
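
For the streaming variation, the JSON can only be parsed once the whole reply has arrived. A minimal sketch of that accumulation step follows. The helper collect_json_stream is hypothetical, and it takes plain text fragments so that it runs without an API call. With the SDK, each fragment would come from chunk.choices[0].delta.content, which can be None on some chunks.

```python
import json


def collect_json_stream(fragments):
    """Join streamed text fragments and parse the result as JSON.

    `fragments` is any iterable of str or None values. Hypothetical helper,
    not part of the openai SDK.
    """
    parts = []
    for fragment in fragments:
        if fragment:  # skip None and empty fragments
            parts.append(fragment)
    return json.loads("".join(parts))


# Simulated fragments standing in for a streamed model reply.
simulated = ['{"Name": "Al', 'ice", ', None, '"Age": 30}']
print(collect_json_stream(simulated))  # {'Name': 'Alice', 'Age': 30}
```

With a real stream, the fragments could be supplied as a generator such as (chunk.choices[0].delta.content for chunk in stream).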

Troubleshooting

If JSON parsing fails, verify the model's output format and consider adding explicit instructions to output valid JSON only.
Use print(json_text) to debug the raw response.
If keys or values are missing, refine the prompt to clarify expected output.
Check your API key and environment variable if you get authentication errors.
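
The parsing and missing-key checks above can be sketched as one helper. This is an illustration under assumptions: parse_key_values and its required_keys parameter are hypothetical names, and the fence-stripping step assumes the model wrapped its JSON in a Markdown code fence, which is a common but not guaranteed failure mode.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker

# Opening fence with an optional language tag (e.g. "json"), body, closing fence.
_FENCED = re.compile(
    r"^" + re.escape(FENCE) + r"[A-Za-z]*\s*\n(.*?)\n?" + re.escape(FENCE) + r"\s*$",
    re.DOTALL,
)


def parse_key_values(raw, required_keys=()):
    """Parse model output into a dict and report any missing keys.

    Returns a (pairs, missing) tuple. Hypothetical helper, not part of any SDK.
    """
    text = raw.strip()
    match = _FENCED.match(text)
    if match:
        # Keep only the content between the fences.
        text = match.group(1).strip()
    try:
        pairs = json.loads(text)
    except json.JSONDecodeError:
        # Show the raw reply so the formatting problem is easy to spot.
        print("Raw model output was not valid JSON:", repr(raw))
        return {}, list(required_keys)
    if not isinstance(pairs, dict):
        # Valid JSON, but not an object of key-value pairs.
        return {}, list(required_keys)
    missing = [key for key in required_keys if key not in pairs]
    return pairs, missing
```

A non-empty missing list is a signal to name the expected keys explicitly in the prompt and retry.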


Key Takeaways

Use explicit prompt instructions to get LLMs to output key-value pairs as JSON.
Parse the JSON response in Python to extract structured data reliably.
Models such as gpt-4o-mini or claude-3-5-sonnet-20241022 offer a reasonable balance of accuracy and cost for this task.
Test and refine prompts to handle complex or nested key-value extraction.
Use SDK v1+ patterns and keep the API key in an environment variable rather than in source code.

Verified 2026-04 · gpt-4o-mini, claude-3-5-sonnet-20241022
