How to process charts and graphs with an LLM

Quick answer

Use a combination of image processing libraries like OpenCV or Pillow to extract chart elements, then feed extracted data or descriptions to an LLM such as gpt-4o via the OpenAI SDK for interpretation or summarization. Alternatively, use multimodal LLMs that accept images directly to analyze charts and graphs in one step.

PREREQUISITES

Python 3.8+
OpenAI API key, exported as the OPENAI_API_KEY environment variable
pip install "openai>=1.0" opencv-python pillow matplotlib numpy

Setup

Install the necessary Python packages for image processing and OpenAI API access.

bash

pip install "openai>=1.0" opencv-python pillow matplotlib numpy

Step by step

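This example uses OpenCV to approximate bar positions and heights (in pixels) from a chart image, then sends a textual summary to gpt-4o for interpretation. It is a heuristic for simple bar charts on a light background: contours can also pick up axes, gridlines, and text labels.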

python

import os
import cv2
import numpy as np
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

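# Load chart image; cv2.imread returns None instead of raising when the file cannot be read
image_path = "chart.png"
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")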

# Convert to grayscale and threshold to isolate chart lines
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

# Find contours which may correspond to bars or lines
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Extract bounding boxes and approximate data points
data_points = []
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Use height as value (example for bar chart)
    data_points.append((x, h))

# Sort data points by x position
data_points.sort(key=lambda p: p[0])

# Create a textual summary of extracted data
summary = "Extracted bar chart data points (x position, height):\n"
summary += "\n".join([f"Bar at {x}: height {h}" for x, h in data_points])

# Query LLM for interpretation
messages = [
    {"role": "user", "content": f"Here is the extracted chart data:\n{summary}\nPlease provide a summary and insights."}
]
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
print(response.choices[0].message.content)

output

Bar chart summary and insights generated by the model, e.g.:
"The chart shows several bars with varying heights indicating different values. The tallest bar is at position X, suggesting the highest data point..."
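The heights extracted above are pixel measurements, not axis values. A minimal sketch of one way to calibrate them, assuming a linear y-axis and two reference rows whose values you have read off the chart by hand. The helper names, the calibration numbers, and the `min_area` cutoff are all hypothetical, not part of OpenCV or the OpenAI SDK.

```python
def make_pixel_to_value(y_pixel_a, value_a, y_pixel_b, value_b):
    """Return a function mapping a y pixel row to a chart value (linear axis assumed)."""
    if y_pixel_a == y_pixel_b:
        raise ValueError("calibration points must be on different pixel rows")
    scale = (value_b - value_a) / (y_pixel_b - y_pixel_a)

    def pixel_to_value(y_pixel):
        return value_a + (y_pixel - y_pixel_a) * scale

    return pixel_to_value


def bars_to_values(boxes, pixel_to_value, min_area=100):
    """Convert (x, y, w, h) boxes, as returned by cv2.boundingRect, to (x, value) pairs.

    The top edge of each box (y) is taken as the bar's value.
    Boxes smaller than min_area pixels are skipped as likely text or tick marks.
    """
    values = []
    for x, y, w, h in sorted(boxes):
        if w * h < min_area:
            continue
        values.append((x, round(pixel_to_value(y), 2)))
    return values


# Hypothetical calibration: pixel row 400 is the baseline (0), row 100 is the tick labelled 60
to_value = make_pixel_to_value(400, 0.0, 100, 60.0)
boxes = [(50, 300, 40, 100), (120, 200, 40, 200), (10, 10, 3, 3)]
print(bars_to_values(boxes, to_value))  # [(50, 20.0), (120, 40.0)]
```

Sending calibrated values rather than raw pixel heights gives the model numbers it can compare against the chart's own axis labels.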

Common variations

Use multimodal models such as gpt-4o or gpt-4o-mini, which accept image input, to send charts directly for analysis.
Use Pillow for simpler image manipulations or matplotlib to recreate charts from extracted data.
Use the SDK's AsyncOpenAI client with asyncio to analyze several charts concurrently.

python

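import os
import asyncio
import base64
from openai import AsyncOpenAI

async def analyze_chart():
    # Awaitable calls require the async client, not the synchronous OpenAI class
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    # Chat Completions accepts inline images as base64 data URLs
    with open("chart.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    messages = [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this chart image."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }]
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    print(response.choices[0].message.content)

asyncio.run(analyze_chart())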

output

Model's textual analysis of the chart image.
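When a chart image is sent directly to a model, a very large file can be downscaled first. The sketch below only computes the target dimensions while preserving aspect ratio. The function name and the 1024-pixel default are illustrative assumptions, not a documented API limit.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height) so the longer side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough; never upscale
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # (1024, 768)
print(fit_within(800, 600))    # (800, 600)
```

With OpenCV, the result can be passed as the target size, for example `cv2.resize(image, fit_within(w, h), interpolation=cv2.INTER_AREA)`, where `h, w = image.shape[:2]`.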

Troubleshooting

If the chart extraction yields noisy or incorrect data, adjust image thresholding parameters or preprocess the image (e.g., denoising, resizing).
For large images, resize before sending to the LLM to avoid exceeding token or file size limits.
If the LLM response is vague, provide more structured extracted data or use prompt engineering to guide the model.
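One way to give the model more structure is to embed the extracted points as JSON and ask specific questions. This is a minimal sketch. The function name, field names, and question wording are assumptions to adapt to your own charts.

```python
import json


def build_chart_prompt(data_points, chart_type="bar", units="pixels"):
    """Build a prompt that embeds extracted (x, value) pairs as JSON plus explicit questions."""
    payload = {
        "chart_type": chart_type,
        "units": units,
        "points": [{"x": x, "value": v} for x, v in data_points],
    }
    return (
        "You are given data extracted from a chart image. Values may be noisy.\n"
        f"Data (JSON):\n{json.dumps(payload, indent=2)}\n"
        "Answer in three short bullet points:\n"
        "1. Which point has the highest value and which has the lowest?\n"
        "2. Is there an overall upward or downward trend from left to right?\n"
        "3. Are any points likely extraction errors?"
    )


# Hypothetical extracted data: (x position, height) pairs
prompt = build_chart_prompt([(50, 100), (120, 200), (190, 150)])
print(prompt)
```

The returned string can be used as the `content` of the user message in the step-by-step example.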

Key Takeaways

Combine image processing libraries with LLMs to extract and interpret chart data effectively.
Use multimodal LLMs to analyze charts directly from images without manual data extraction.
Preprocess images carefully to improve data extraction accuracy before LLM input.

Verified 2026-04 · gpt-4o, gpt-4o-mini
