Code beginner · 3 min read

How to use Gemini for JSON output in Python

Direct answer
Use the gemini-1.5-pro model through the OpenAI Python SDK pointed at Google's OpenAI-compatible endpoint. Instruct the model in the prompt to return only JSON, then parse response.choices[0].message.content with json.loads.

Setup

Install
bash
pip install openai
Env vars
GEMINI_API_KEY
Imports
python
import os
from openai import OpenAI
import json

Examples

In: Generate JSON with user info: name, age, city
Out: {"name": "Alice", "age": 30, "city": "Seattle"}
In: Create JSON for a product with id, name, price
Out: {"id": "P123", "name": "Wireless Mouse", "price": 25.99}
In: Output JSON for a book with title, author, year
Out: {"title": "1984", "author": "George Orwell", "year": 1949}

Integration steps

  1. Install the OpenAI Python SDK and set GEMINI_API_KEY in your environment variables
  2. Import the OpenAI client and json module
  3. Initialize the OpenAI client with your Gemini API key and Google's OpenAI-compatible base_url
  4. Create a chat completion request using model 'gemini-1.5-pro' with a user message instructing JSON-only output
  5. Parse the returned text from response.choices[0].message.content with json.loads
  6. Use or print the parsed JSON object

Full code

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = (
    "Generate a JSON object with the following user info fields: name, age, city. "
    "Respond only with the JSON object, no extra text."
)

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}]
)

json_text = response.choices[0].message.content

try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)
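
Even with a "JSON only" instruction, the model sometimes wraps its answer in a Markdown code fence. A small helper (hypothetical, not part of any SDK) can strip that before parsing:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Strip an optional ```json ... ``` fence, then parse the payload."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    return json.loads(cleaned)

print(extract_json('```json\n{"name": "Alice"}\n```'))  # {'name': 'Alice'}
print(extract_json('{"name": "Alice"}'))                # {'name': 'Alice'}
```

Passing json_text through a helper like this makes the parse step tolerant of both fenced and bare output.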

API trace

Request
json
{"model": "gemini-1.5-pro", "messages": [{"role": "user", "content": "Generate a JSON object with the following user info fields: name, age, city. Respond only with the JSON object, no extra text."}]}
Response
json
{"choices": [{"message": {"content": "{\"name\": \"Alice\", \"age\": 30, \"city\": \"Seattle\"}"}}], "usage": {"total_tokens": 50}}
Extract: response.choices[0].message.content

Variants

Streaming JSON output

Use streaming when you want to display JSON output progressively for large responses or better UX.

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}],
    stream=True
)

json_chunks = []
for chunk in response:
    content = chunk.choices[0].delta.content or ''
    print(content, end='')  # Print as it streams
    json_chunks.append(content)

json_text = ''.join(json_chunks)
try:
    data = json.loads(json_text)
    print("\nParsed JSON output:", data)
except json.JSONDecodeError:
    print("\nFailed to parse streamed JSON.")
Async JSON generation

Use async when integrating Gemini calls into asynchronous Python applications for concurrency.

python
import os
import asyncio
from openai import AsyncOpenAI
import json

client = AsyncOpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

async def generate_json():
    prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
    response = await client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": prompt}]
    )
    json_text = response.choices[0].message.content
    try:
        data = json.loads(json_text)
        print("Parsed JSON output:", data)
    except json.JSONDecodeError:
        print("Failed to parse JSON.")

asyncio.run(generate_json())
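Async only pays off when several requests run at once. The sketch below shows the asyncio.gather fan-out pattern; fetch_json is a stub standing in for the real awaited client.chat.completions.create call plus json.loads, so the pattern is clear without network access:

```python
import asyncio

async def fetch_json(prompt: str) -> dict:
    # Stand-in for: await client.chat.completions.create(...) + json.loads(...)
    await asyncio.sleep(0.05)  # simulated network latency
    return {"prompt": prompt}

async def main() -> list:
    prompts = ["user info", "product info", "book info"]
    # All three "requests" run concurrently rather than one after another
    return await asyncio.gather(*(fetch_json(p) for p in prompts))

results = asyncio.run(main())
print(results)
```

With real API calls, total wall time approaches that of the single slowest request instead of the sum of all three.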
Use gemini-2.0-flash for faster JSON output

Use gemini-2.0-flash for lower latency and cost when JSON output speed is critical.

python
import os
from openai import OpenAI
import json

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": prompt}]
)

json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON.")

Performance

Latency: ~700 ms for gemini-1.5-pro, non-streaming
Cost: ~$0.0015 per 500 tokens
Rate limits: Tier 1: 600 RPM / 40K TPM
  • Keep prompts concise to reduce token usage
  • Use streaming for large JSON outputs to start processing early
  • Cache frequent JSON responses to avoid repeated calls
Approach | Latency | Cost/call | Best for
Standard call with gemini-1.5-pro | ~700 ms | ~$0.0015 | Reliable JSON output with good quality
Streaming with gemini-1.5-pro | Starts immediately, total ~700 ms | ~$0.0015 | Progressive JSON output for large data
Async call with gemini-1.5-pro | ~700 ms | ~$0.0015 | Concurrent JSON generation in async apps
Using gemini-2.0-flash | ~400 ms | ~$0.0010 | Faster, cost-effective JSON output
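
The caching tip above can be as simple as a dict keyed by prompt; fake_fetch below is a placeholder for the real Gemini call, used here only to show that repeated prompts hit the network once:

```python
_cache: dict = {}

def cached_generate(prompt: str, fetch) -> dict:
    """Return a cached result for repeated prompts; call fetch only on a miss."""
    if prompt not in _cache:
        _cache[prompt] = fetch(prompt)
    return _cache[prompt]

calls = 0
def fake_fetch(prompt):  # placeholder for the real API call + json.loads
    global calls
    calls += 1
    return {"prompt": prompt}

cached_generate("user info", fake_fetch)
cached_generate("user info", fake_fetch)  # second call is served from the cache
print(calls)  # 1
```

For production use you would bound the cache size and expire entries, but the prompt-keyed lookup is the core of the idea.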

Quick tip

Always instruct Gemini explicitly to respond with JSON only, and parse the output with json.loads rather than string manipulation. For stricter guarantees, the OpenAI-compatible endpoint also accepts response_format={"type": "json_object"} on supported models.

Common mistake

Beginners often forget to parse the string output as JSON, treating it as plain text instead.
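
A quick way to see the difference: the raw response is a str, and indexing a string gives characters, not fields:

```python
import json

raw = '{"name": "Alice", "age": 30}'
print(raw[0])        # { — indexing the unparsed string yields a character
data = json.loads(raw)
print(data["name"])  # Alice — after parsing, field access works
```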

Verified 2026-04 · gemini-1.5-pro, gemini-2.0-flash