How to · Beginner to Intermediate · 4 min read

How to use AI to analyze CSV files

Quick answer
Use a Python script to read CSV files and send relevant data or summaries as prompts to an AI model like gpt-4o via the OpenAI API. The AI can then analyze, summarize, or generate insights from the CSV content based on your instructions.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"
  • Basic knowledge of CSV and Python file handling

Setup

Install the OpenAI Python SDK and set your API key as an environment variable to authenticate requests.

bash
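pip install "openai>=1.0"  # quotes stop the shell from treating > as a redirect
export OPENAI_API_KEY="your-api-key"  # placeholder; substitute your own key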
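
Before making any request, it can help to confirm that the key is actually visible to Python. The sketch below is one way to do that; `require_api_key` is a hypothetical helper name used for illustration, not part of the OpenAI SDK.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment or fail with a clear message.

    Hypothetical helper for illustration; not an OpenAI SDK function.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script.")
    return key
```

Calling `require_api_key()` at the top of a script turns a missing key into an immediate, readable error rather than a failure at the first API call.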

Step by step

This example reads a CSV file, takes the header and the first few rows as text, and sends them to gpt-4o for a summary.

python
import csv
import os
from itertools import islice

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Read the header plus up to 5 data rows (islice stops early on short files,
# where repeated next() calls would raise StopIteration)
with open('data.csv', 'r', encoding='utf-8', newline='') as f:
    reader = csv.reader(f)
    headers = next(reader)
    rows = list(islice(reader, 5))

# Prepare prompt with CSV snippet
csv_text = ", ".join(headers) + "\n"
csv_text += "\n".join([", ".join(row) for row in rows])

prompt = f"Analyze the following CSV data and provide a summary of key insights:\n{csv_text}"

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)

print(response.choices[0].message.content)
output
Summary: The CSV data shows columns for Name, Age, Department, Salary, and Start Date. Among the first 5 entries, the average age is 29, with most employees in the Sales and Engineering departments. Salaries range from $50,000 to $85,000, indicating a mid-level workforce.
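
One caveat with the snippet above: joining fields with ", " drops CSV quoting, so a value that itself contains a comma becomes ambiguous in the prompt. A minimal sketch of an alternative, using the standard library's `csv.writer` to serialize the rows; `rows_to_csv_text` is a hypothetical helper name, and the sample values are made up.

```python
import csv
import io


def rows_to_csv_text(headers, rows):
    """Serialize a header and rows to CSV text, quoting fields where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample: the first name contains a comma and is quoted on output
sample = rows_to_csv_text(["Name", "Age"], [["Smith, Jane", "29"], ["Lee", "31"]])
print(sample)
```

The returned string can be used in place of `csv_text` when building the prompt.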

Common variations

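You can make the request asynchronously so that several analyses can run concurrently, stream the response to show output as it is generated, or switch to another model. A model such as claude-3-5-sonnet-20241022 is served through Anthropic's API and SDK rather than the OpenAI client used here, and it may suit some coding and data tasks better. The example below shows the asynchronous variant.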

python
import asyncio
import csv
import os
from itertools import islice

from openai import AsyncOpenAI

# AsyncOpenAI exposes the same methods as OpenAI, but they are awaitable
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def analyze_csv_async():
    # Read the header plus up to 5 data rows
    with open('data.csv', 'r', encoding='utf-8', newline='') as f:
        reader = csv.reader(f)
        headers = next(reader)
        rows = list(islice(reader, 5))

    csv_text = ", ".join(headers) + "\n"
    csv_text += "\n".join([", ".join(row) for row in rows])

    prompt = f"Analyze the following CSV data and summarize key points:\n{csv_text}"

    # openai>=1.0 has no acreate(); await create() on the async client instead
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )

    print(response.choices[0].message.content)

asyncio.run(analyze_csv_async())
output
Summary: The CSV snippet includes employee details with an average age of 29 and a concentration in Sales and Engineering departments.
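
The streaming variation mentioned above is not covered by the async example. A minimal sketch, assuming the `stream=True` option of the OpenAI SDK's chat completions call; `stream_analysis` is a hypothetical helper, and the client is passed in as an argument so the function does not depend on a global.

```python
def stream_analysis(client, prompt, model="gpt-4o"):
    """Print the model's reply as it arrives and return the full text.

    Hypothetical helper; `client` is expected to be an openai.OpenAI instance.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # yields incremental chunks instead of one response
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or no text, so guard before reading
        if chunk.choices and chunk.choices[0].delta.content:
            text = chunk.choices[0].delta.content
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)
```

With the synchronous client and prompt from the step-by-step example, this could be called as `stream_analysis(client, prompt)`.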

Troubleshooting

  • If you get an authentication error, verify your OPENAI_API_KEY environment variable is set correctly.
  • If the CSV is too large, consider sending only relevant slices or summaries to the AI to avoid token limits.
  • For unexpected output, refine your prompt to be more specific about the analysis you want.
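
For the token-limit case, one option is to compute a compact per-column summary locally and send that instead of raw rows. The sketch below uses only the standard library; `summarize_csv`, its output format, and the sample data are illustrative assumptions rather than a fixed recipe.

```python
import csv
import io
from collections import Counter


def summarize_csv(file_obj, top_n=3):
    """Build a short text summary of each column in a CSV file object.

    Numeric columns get min/max/mean; other columns get their most
    common values. Hypothetical helper for illustration.
    """
    reader = csv.DictReader(file_obj)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            value = (row.get(name) or "").strip()
            if value:  # skip empty cells
                columns[name].append(value)

    lines = []
    for name, values in columns.items():
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            numbers = None  # at least one value is not a number
        if numbers:
            mean = sum(numbers) / len(numbers)
            lines.append(
                f"{name}: numeric, n={len(numbers)}, min={min(numbers):g}, "
                f"max={max(numbers):g}, mean={mean:.2f}"
            )
        else:
            common = ", ".join(
                f"{v} ({c})" for v, c in Counter(values).most_common(top_n)
            )
            lines.append(f"{name}: text, n={len(values)}, top values: {common}")
    return "\n".join(lines)


# Made-up sample data; in practice pass an open file handle for data.csv
sample = io.StringIO("Name,Age,Department\nAna,30,Sales\nBo,28,Sales\nCy,,Engineering\n")
print(summarize_csv(sample))
```

The returned text can replace `csv_text` in the prompt, which keeps the request size roughly constant no matter how many rows the file has.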

Key Takeaways

  • Use Python's csv module to read and preprocess CSV data before sending it to an AI model.
  • Send concise CSV snippets as prompt text to gpt-4o or similar models for analysis and summarization.
  • Async calls let you run several requests concurrently, and streaming shows output as it is generated; neither reduces the number of tokens a large CSV consumes.
Verified 2026-04 · gpt-4o, claude-3-5-sonnet-20241022