API Intermediate medium · 6 min

Use cases: data analysis, math, plotting

What you will learn

Use Gemini to analyze datasets, solve mathematical problems, and generate plotting code by passing raw data and requesting structured code output.

Why this matters

Gemini can reason about numerical data and generate executable Python for analysis, so you can go from a raw dataset to analysis or plotting code in a single request. This is useful for exploratory analysis, quick calculations, and auto-generating visualization code.

Skip if: You need guaranteed mathematical correctness (financial calculations, scientific research) and cannot verify Gemini's output independently. Your data is proprietary or sensitive and should not be sent to an external API. Latency matters, in which case use local pandas/NumPy directly.

Explanation

What it does: Gemini can accept raw data (CSV-like text, JSON, or plain tables) and generate analysis, perform math, or write plotting code. You send the data in your prompt and Gemini returns executable Python or mathematical answers.

How it works: The model processes your data as text, infers its structure and your intent, then generates Python code (using matplotlib, seaborn, or pandas) or a written answer. Unlike a statistics API, Gemini does not execute arithmetic: it predicts text, so any number it returns needs verifying and any code it writes is yours to run. This is economical for small datasets but becomes expensive for large CSVs, because token count grows with the size of the pasted text.

When to use it: Quick exploratory analysis, auto-generating boilerplate plotting code, explaining mathematical approaches, handling ad-hoc data formats the model can parse. Not for production pipelines requiring bulletproof math or handling multi-GB datasets.

Request code

Illustrative only - not runnable without a valid API key

python

import os

import google.generativeai as genai

# Read the key from the environment instead of hard-coding it.
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')

data = """Date,Sales,Region
2024-01-01,15000,North
2024-01-02,18500,South
2024-01-03,12000,East
2024-01-04,22000,West
2024-01-05,19500,North"""

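# State the output format explicitly so the reply is code, not commentary.
prompt = f"""Analyze this sales data and generate Python code to plot daily sales by region using matplotlib. Return only executable Python code, no explanation.

{data}"""

# One single-turn request; the generated code arrives as plain text.
response = model.generate_content(prompt)
print(response.text)  # Review the output before running it.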

Authentication

Set your Google API key before calling `genai.configure()`, which is where the SDK receives it: run `export GOOGLE_API_KEY='your-key'` in your shell, or set `os.environ['GOOGLE_API_KEY'] = 'your-key'` in Python first.

Response shape

Field	Description
`text`	string containing the generated analysis, code, or mathematical explanation
`usage_metadata`	object with token counts: `prompt_token_count`, `candidates_token_count`, `total_token_count`
`finish_reason`	enum indicating why generation stopped (STOP, MAX_TOKENS, etc.); read it from `response.candidates[0].finish_reason`

Field guide

text

Your main output: for data analysis use cases, this will be executable Python code or written analysis. Always validate the code before running it.
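A first validation pass can be sketched as follows: pull the code out of the reply (models often wrap it in a Markdown fence even when asked not to) and check that it parses. The helper names `extract_python` and `is_valid_python` are hypothetical, and a successful parse only rules out syntax errors, not wrong logic.

```python
import ast
import re

# Markdown fence marker, built from single backticks to keep this example readable.
FENCE = "`" * 3
_BLOCK = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_python(reply_text: str) -> str:
    """Return the first fenced block in the reply, or the whole reply if unfenced."""
    match = _BLOCK.search(reply_text)
    code = match.group(1) if match else reply_text
    return code.strip()


def is_valid_python(code: str) -> bool:
    """True if the code parses. This does not prove the code is correct."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


# Stand-in for response.text from the request above.
reply = FENCE + "python\nimport matplotlib.pyplot as plt\nplt.plot([1, 2], [3, 4])\n" + FENCE
code = extract_python(reply)
print(is_valid_python(code))
```

Only code that passes this check is worth running at all, and it still deserves a read-through before execution.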

usage_metadata

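Critical for cost tracking. To estimate input cost, divide `prompt_token_count` by 1,000,000 and multiply by the per-million input price (this guide uses $0.075 per 1M input tokens for gemini-2.0-flash; confirm the figure against the current pricing page). Output tokens are billed at a separate rate. Developers often skip this and are shocked by bills when processing large datasets.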
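The arithmetic is easy to get wrong by a factor of a million, so here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical; the token counts would come from `usage_metadata.prompt_token_count` and `usage_metadata.candidates_token_count`, and both prices are placeholders to replace with current published rates.

```python
def estimate_cost_usd(prompt_tokens: int, output_tokens: int,
                      input_price_per_million: float,
                      output_price_per_million: float) -> float:
    """Convert token counts to dollars using per-million-token prices."""
    input_cost = prompt_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Placeholder prices: 0.075 is the input figure quoted in this guide;
# 0.30 is an assumed output price used for illustration only.
cost = estimate_cost_usd(200_000, 1_000, 0.075, 0.30)
print(f"${cost:.4f}")
```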

Cost

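Token count scales with the amount of text you paste. At roughly 4 characters per token, a 10,000-row CSV with rows like the sample above (about 22 characters each) comes to somewhere around 55,000 to 60,000 tokens, which is under a cent at $0.075 per 1M input tokens. A 1M-row CSV of the same shape would run to several million tokens, likely more than the model's context window can hold, so aggregate or sample the data locally first. Uploading through google.generativeai's File API avoids pasting the text inline, but the file's tokens still count toward the prompt.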
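To sanity-check a dataset before sending it, a rough estimate can be sketched with the common rule of thumb of about 4 characters per token. That ratio is an approximation (numeric data may tokenize less efficiently), and `rough_token_estimate` is a hypothetical helper; for an exact figure, the SDK's `model.count_tokens(prompt)` counts tokens without generating a reply.

```python
import math


def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate token count from character length (rule of thumb, not exact)."""
    return math.ceil(len(text) / chars_per_token)


# Synthetic 10,000-row CSV shaped like the sample data in this guide.
rows = ["2024-01-01,15000,North"] * 10_000
csv_text = "Date,Sales,Region\n" + "\n".join(rows)

print(len(csv_text), "characters")
print(rough_token_estimate(csv_text), "tokens (approximate)")
```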

Rate limits

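Gemini API rate limits depend on the model and your billing tier, and the free tier is the most restrictive, so check the current per-minute limit for your project rather than assuming a fixed number. Data analysis workflows that generate code and then re-run it for verification can hit the limit quickly. Add exponential backoff: catch `google.api_core.exceptions.ResourceExhausted` and retry after 2^attempt seconds.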
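The backoff described above might look like the sketch below. `call_with_backoff` is a hypothetical helper, not part of the SDK; the exception type and the sleep function are passed in so the example runs without network access, and the demo uses a stand-in exception class instead of the real one.

```python
import random
import time


def call_with_backoff(call, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on retry_on, wait base_delay * 2^attempt plus jitter and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the original error.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)


class FakeRateLimit(Exception):
    """Stand-in for google.api_core.exceptions.ResourceExhausted."""


attempts = {"count": 0}
recorded_delays = []


def flaky_request():
    # Fails twice, then succeeds, to exercise the retry path.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimit()
    return "generated code"


# Delays are recorded rather than slept so the demo finishes instantly.
result = call_with_backoff(flaky_request, FakeRateLimit, sleep=recorded_delays.append)
print(result, len(recorded_delays))
```

With the real SDK, the call would be something like `call_with_backoff(lambda: model.generate_content(prompt), ResourceExhausted)`, leaving `sleep` at its default.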

Common gotcha

Asking Gemini to 'analyze this data' without specifying an output format results in explanatory text, not code. Add 'return only Python code' or 'return a JSON summary' to control the response type. Also, pasting large CSVs (>10K rows) into prompts burns tokens fast: for production, aggregate or sample the data locally and send only what the request needs.

Error recovery

google.api_core.exceptions.InvalidArgument

Usually means malformed prompt or unsupported characters. Encode data as valid UTF-8 and escape special characters in multiline strings.

google.api_core.exceptions.ResourceExhausted

Rate limited. Retry with exponential backoff and jitter; if the error persists, wait 60+ seconds for the per-minute quota to reset.

google.api_core.exceptions.PermissionDenied

API key is invalid or not set. Verify `echo $GOOGLE_API_KEY` in shell; ensure it's a Generative AI key, not Cloud Console auth.

Runtime errors in generated code (KeyError, IndexError, ValueError)

Gemini generated syntactically valid but logically broken code, for example a wrong column name (KeyError) or an out-of-range index (IndexError). Always test generated code in a sandbox before production use.
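One way to catch such failures early is to run the generated script in a separate process with a timeout, as sketched below. `run_generated_code` is a hypothetical helper. A child process plus timeout only contains crashes and hangs; it is not a security sandbox, so untrusted code still needs a container or similar isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 10.0):
    """Run code in a child Python process; return (succeeded, stderr_text)."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,          # Keep any files the script writes in the temp dir.
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    return proc.returncode == 0, proc.stderr.strip()


# A typical failure: the generated code uses the wrong column name.
bad_code = "row = {'Sales': 15000}\nprint(row['sales'])"
ok, err = run_generated_code(bad_code)
print(ok)
print(err.splitlines()[-1] if err else "")
```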

Experienced dev note

If you're using Gemini for data analysis, separate the analysis request from code generation. Ask Gemini to 'describe what plot would show sales trends' first, then in a follow-up request 'generate the code.' This two-step approach costs slightly more in tokens but produces higher-quality, more predictable code because the model committed to an approach before writing it. Also: always ask for output in a format you can parse: 'return valid JSON' is safer than 'explain the results' because JSON is machine-readable and you can version it.
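The two-step approach can be sketched as a small function that takes any text-in, text-out callable. `two_step_plot_code` and the prompt wording are illustrative assumptions, and the demo uses a stub in place of the model so it runs offline.

```python
from typing import Callable, Tuple


def two_step_plot_code(generate: Callable[[str], str], data: str) -> Tuple[str, str]:
    """Ask for a plan first, then ask for code that implements that plan."""
    plan = generate(
        "Describe which single plot best shows sales trends in this data. "
        "Do not write code.\n\n" + data
    )
    code = generate(
        "Write matplotlib code that implements this plan. "
        "Return only executable Python code, no explanation.\n\n"
        "Plan:\n" + plan + "\n\nData:\n" + data
    )
    return plan, code


# Stub standing in for the model; it records prompts and returns canned replies.
seen_prompts = []


def fake_generate(prompt: str) -> str:
    seen_prompts.append(prompt)
    return "Line chart of Sales by Date" if len(seen_prompts) == 1 else "print('plot')"


plan, code = two_step_plot_code(fake_generate, "Date,Sales\n2024-01-01,15000")
print(plan)
print(code)
```

With the SDK, `generate` could be `lambda p: model.generate_content(p).text`, which makes two billable requests per plot.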

Check your understanding

You send a 50,000-row CSV to Gemini in a prompt asking it to 'find trends and plot them.' The API charges you based on prompt tokens, not the quality or usefulness of the analysis. How would you restructure this to reduce costs while keeping the same analysis quality?

Answer hint

Token cost is proportional to data size, so shrink what you send: pre-aggregate or sample the data locally and pass only the summary to Gemini. If the same dataset must be queried repeatedly, context caching can lower the cost of resending it; uploading through the File API alone does not, because the file's tokens still count on each request. A specific request such as 'plot sales by month' also tends to produce shorter, more usable output than 'find trends'.
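Pre-summarizing locally might look like the sketch below, which collapses the raw rows into one line per region before anything is sent. It uses only the standard library; `summarize_sales` is a hypothetical helper and the column names are the ones from the sample data in this guide.

```python
import csv
import io
from collections import defaultdict


def summarize_sales(csv_text: str) -> str:
    """Aggregate rows into per-region totals and return a compact CSV string."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Region"]] += int(row["Sales"])
        counts[row["Region"]] += 1
    lines = ["Region,TotalSales,Rows"]
    for region in sorted(totals):
        lines.append(f"{region},{totals[region]},{counts[region]}")
    return "\n".join(lines)


data = """Date,Sales,Region
2024-01-01,15000,North
2024-01-02,18500,South
2024-01-03,12000,East
2024-01-04,22000,West
2024-01-05,19500,North"""

summary = summarize_sales(data)
print(summary)
```

The summary has one line per region no matter how many rows the source file holds, so the prompt stays small as the dataset grows.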

VERSION The examples here assume google-generativeai 0.8.x. Use `model.generate_content()` for single-turn requests like the ones on this page; `model.start_chat()` with `chat.send_message()` remains available for multi-turn conversations.
