Code beginner · 3 min read

How to use Gemini for JSON output in Python

Direct answer
Use the gemini-1.5-pro model through the OpenAI Python SDK pointed at Google's OpenAI-compatible endpoint. Instruct the model in the prompt to return only JSON, then parse response.choices[0].message.content with json.loads.

Setup

Install
bash
pip install openai
Env vars
GEMINI_API_KEY
Imports
python
import os
from openai import OpenAI
import json

Examples

In: Generate JSON with user info: name, age, city
Out: {"name": "Alice", "age": 30, "city": "Seattle"}
In: Create JSON for a product with id, name, price
Out: {"id": "P123", "name": "Wireless Mouse", "price": 25.99}
In: Output JSON for a book with title, author, year
Out: {"title": "1984", "author": "George Orwell", "year": 1949}

Integration steps

  1. Install the OpenAI Python SDK and set GEMINI_API_KEY in your environment variables
  2. Import the OpenAI client and json module
  3. Initialize the OpenAI client with your Gemini API key and Google's OpenAI-compatible base_url
  4. Create a chat completion request using model 'gemini-1.5-pro' with a user message instructing JSON-only output
  5. Parse the returned text from response.choices[0].message.content with json.loads
  6. Use or print the parsed JSON object

Full code

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = (
    "Generate a JSON object with the following user info fields: name, age, city. "
    "Respond only with the JSON object, no extra text."
)

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}]
)

json_text = response.choices[0].message.content

try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)
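
Even with a "JSON only" instruction, the model sometimes wraps its answer in a Markdown code fence. A small helper (hypothetical, not part of any SDK) can strip that before parsing:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Strip an optional ```json ... ``` fence, then parse the payload."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    return json.loads(cleaned)

print(extract_json('```json\n{"name": "Alice"}\n```'))  # {'name': 'Alice'}
print(extract_json('{"name": "Alice"}'))                # {'name': 'Alice'}
```

Passing json_text through a helper like this makes the parse step tolerant of both fenced and bare output.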

API trace

Request
json
{"model": "gemini-1.5-pro", "messages": [{"role": "user", "content": "Generate a JSON object with the following user info fields: name, age, city. Respond only with the JSON object, no extra text."}]}
Response
json
{"choices": [{"message": {"content": "{\"name\": \"Alice\", \"age\": 30, \"city\": \"Seattle\"}"}}], "usage": {"total_tokens": 50}}
Extract: response.choices[0].message.content

Variants

Streaming JSON output

Use streaming when you want to display JSON output progressively for large responses or better UX.

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}],
    stream=True
)

json_chunks = []
for chunk in response:
    content = chunk.choices[0].delta.content or ''
    print(content, end='')  # Print as it streams
    json_chunks.append(content)

json_text = ''.join(json_chunks)
try:
    data = json.loads(json_text)
    print("\nParsed JSON output:", data)
except json.JSONDecodeError:
    print("\nFailed to parse streamed JSON.")
Async JSON generation

Use async when integrating Gemini calls into asynchronous Python applications for concurrency.

python
import os
import asyncio
from openai import AsyncOpenAI
import json

client = AsyncOpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

async def generate_json():
    prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
    response = await client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": prompt}]
    )
    json_text = response.choices[0].message.content
    try:
        data = json.loads(json_text)
        print("Parsed JSON output:", data)
    except json.JSONDecodeError:
        print("Failed to parse JSON.")

asyncio.run(generate_json())
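Async only pays off when several requests run at once. The sketch below shows the asyncio.gather fan-out pattern; fetch_json is a stub standing in for the real awaited client.chat.completions.create call plus json.loads, so the pattern is clear without network access:

```python
import asyncio

async def fetch_json(prompt: str) -> dict:
    # Stand-in for: await client.chat.completions.create(...) + json.loads(...)
    await asyncio.sleep(0.05)  # simulated network latency
    return {"prompt": prompt}

async def main() -> list:
    prompts = ["user info", "product info", "book info"]
    # All three "requests" run concurrently rather than one after another
    return await asyncio.gather(*(fetch_json(p) for p in prompts))

results = asyncio.run(main())
print(results)
```

With real API calls, total wall time approaches that of the single slowest request instead of the sum of all three.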
Use gemini-2.0-flash for faster JSON output

Use gemini-2.0-flash for lower latency and cost when JSON output speed is critical.

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": prompt}]
)

json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON.")

Performance

Latency: ~700 ms for gemini-1.5-pro, non-streaming
Cost: ~$0.0015 per 500 tokens
Rate limits: Tier 1: 600 RPM / 40K TPM
  • Keep prompts concise to reduce token usage
  • Use streaming for large JSON outputs to start processing early
  • Cache frequent JSON responses to avoid repeated calls
Approach | Latency | Cost/call | Best for
Standard call with gemini-1.5-pro | ~700 ms | ~$0.0015 | Reliable JSON output with good quality
Streaming with gemini-1.5-pro | Starts immediately, total ~700 ms | ~$0.0015 | Progressive JSON output for large data
Async call with gemini-1.5-pro | ~700 ms | ~$0.0015 | Concurrent JSON generation in async apps
Using gemini-2.0-flash | ~400 ms | ~$0.0010 | Faster, cost-effective JSON output
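
The caching tip above can be as simple as a dict keyed by prompt; fake_fetch below is a placeholder for the real Gemini call, used here only to show that repeated prompts hit the network once:

```python
_cache: dict = {}

def cached_generate(prompt: str, fetch) -> dict:
    """Return a cached result for repeated prompts; call fetch only on a miss."""
    if prompt not in _cache:
        _cache[prompt] = fetch(prompt)
    return _cache[prompt]

calls = 0
def fake_fetch(prompt):  # placeholder for the real API call + json.loads
    global calls
    calls += 1
    return {"prompt": prompt}

cached_generate("user info", fake_fetch)
cached_generate("user info", fake_fetch)  # second call is served from the cache
print(calls)  # 1
```

For production use you would bound the cache size and expire entries, but the prompt-keyed lookup is the core of the idea.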

Quick tip

Always instruct Gemini explicitly to respond with JSON only, and parse the output with json.loads rather than string manipulation. For stricter guarantees, the OpenAI-compatible endpoint also accepts response_format={"type": "json_object"} on supported models.

Common mistake

Beginners often forget to parse the string output as JSON, treating it as plain text instead.
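
A quick way to see the difference: the raw response is a str, and indexing a string gives characters, not fields:

```python
import json

raw = '{"name": "Alice", "age": 30}'
print(raw[0])        # { — indexing the unparsed string yields a character
data = json.loads(raw)
print(data["name"])  # Alice — after parsing, field access works
```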

Verified 2026-04 · gemini-1.5-pro, gemini-2.0-flash