How to use Gemini for JSON output in Python
Direct answer
Use the gemini-1.5-pro model through the OpenAI Python SDK by pointing the client at Google's OpenAI-compatible endpoint (base_url="https://generativelanguage.googleapis.com/v1beta/openai/"), instructing the model in the prompt to return only JSON, and parsing response.choices[0].message.content with json.loads.
Setup
Install
pip install openai
Env vars
GEMINI_API_KEY (a Google AI Studio API key)
Imports
import os
from openai import OpenAI
import json
Examples
In: Generate JSON with user info: name, age, city
Out: {"name": "Alice", "age": 30, "city": "Seattle"}
In: Create JSON for a product with id, name, price
Out: {"id": "P123", "name": "Wireless Mouse", "price": 25.99}
In: Output JSON for a book with title, author, year
Out: {"title": "1984", "author": "George Orwell", "year": 1949}
Integration steps
- Install the OpenAI Python SDK and set GEMINI_API_KEY in your environment variables
- Import the OpenAI client and the json module
- Initialize the OpenAI client with your Gemini API key and the Gemini-compatible base_url
- Create a chat completion request using model 'gemini-1.5-pro' with a user message instructing JSON-only output
- Parse the returned text from response.choices[0].message.content with json.loads
- Use or print the parsed JSON object
Full code
import os
from openai import OpenAI
import json
client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
prompt = (
    "Generate a JSON object with the following user info fields: name, age, city. "
    "Respond only with the JSON object, no extra text."
)
response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}]
)
json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)
API trace
Request
{"model": "gemini-1.5-pro", "messages": [{"role": "user", "content": "Generate a JSON object with the following user info fields: name, age, city. Respond only with the JSON object, no extra text."}]} Response
{"choices": [{"message": {"content": "{\"name\": \"Alice\", \"age\": 30, \"city\": \"Seattle\"}"}}], "usage": {"total_tokens": 50}} Extract
response.choices[0].message.content
Variants
Streaming JSON output
Use streaming when you want to display JSON output progressively for large responses or better UX.
import os
from openai import OpenAI
import json
client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}],
    stream=True
)
json_chunks = []
for chunk in response:
    content = chunk.choices[0].delta.content or ""  # delta.content may be None
    print(content, end="")  # print as it streams
    json_chunks.append(content)
json_text = "".join(json_chunks)
try:
    data = json.loads(json_text)
    print("\nParsed JSON output:", data)
except json.JSONDecodeError:
    print("\nFailed to parse streamed JSON.")
Async JSON generation
Use async when integrating Gemini calls into asynchronous Python applications for concurrency.
import os
import asyncio
from openai import AsyncOpenAI
import json
client = AsyncOpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
async def generate_json():
    prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
    response = await client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": prompt}]
    )
    json_text = response.choices[0].message.content
    try:
        data = json.loads(json_text)
        print("Parsed JSON output:", data)
    except json.JSONDecodeError:
        print("Failed to parse JSON.")
asyncio.run(generate_json())
Use gemini-2.0-flash for faster JSON output
Use gemini-2.0-flash for lower latency and cost when JSON output speed is critical.
import os
from openai import OpenAI
import json
client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": prompt}]
)
json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON.")
Performance
Latency: ~700ms for gemini-1.5-pro, non-streaming
Cost: ~$0.0015 per 500 tokens
Rate limits: Tier 1: 600 RPM / 40K TPM
- Keep prompts concise to reduce token usage
- Use streaming for large JSON outputs to start processing early
- Cache frequent JSON responses to avoid repeated calls
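The caching tip can be sketched with functools.lru_cache, keyed on the prompt string. Here fetch_user_json is a hypothetical stand-in for a real client.chat.completions.create call, so the caching mechanics run without network access:

```python
import json
from functools import lru_cache

call_count = 0  # tracks how often the underlying "API" is hit

@lru_cache(maxsize=128)
def fetch_user_json(prompt: str) -> str:
    """Stand-in for a Gemini call; in a real app this would call the API."""
    global call_count
    call_count += 1
    return '{"name": "Alice", "age": 30, "city": "Seattle"}'

# Repeated identical prompts hit the cache, not the API.
a = json.loads(fetch_user_json("Generate JSON with user info: name, age, city"))
b = json.loads(fetch_user_json("Generate JSON with user info: name, age, city"))
print(call_count)  # 1 -- the underlying fetch ran only once
```

Note that lru_cache requires hashable arguments, so cache on the prompt string rather than on a messages list.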
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call with gemini-1.5-pro | ~700ms | ~$0.0015 | Reliable JSON output with good quality |
| Streaming with gemini-1.5-pro | Starts immediately, total ~700ms | ~$0.0015 | Progressive JSON output for large data |
| Async call with gemini-1.5-pro | ~700ms | ~$0.0015 | Concurrent JSON generation in async apps |
| Using gemini-2.0-flash | ~400ms | ~$0.0010 | Faster, cost-effective JSON output |
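At the rate limits above, bursts of calls can fail with 429 errors; a common mitigation (not specific to any SDK) is exponential backoff with jitter. The with_backoff helper below is an illustrative sketch you would wrap around the chat-completion call:

```python
import random
import time

def with_backoff(fn, max_retries=5, base=1.0):
    """Call fn(); on failure, retry with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Wait base * 2^attempt seconds, plus random jitter.
            time.sleep(base * (2 ** attempt) + random.random() * base)

# Usage sketch:
# result = with_backoff(lambda: client.chat.completions.create(...))
```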
Quick tip
Always instruct Gemini explicitly to respond with JSON only and parse the output with a JSON parser to avoid errors.
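Even with a JSON-only instruction, models sometimes wrap the object in Markdown code fences. A minimal defensive parser (parse_model_json is an illustrative helper, not part of any SDK) strips fences before parsing:

```python
import json

def parse_model_json(text: str) -> dict:
    """Strip optional Markdown code fences, then parse as JSON."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (which may carry a "json" tag)...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned
        # ...and the closing fence.
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)

raw = '```json\n{"name": "Alice", "age": 30}\n```'
print(parse_model_json(raw)["name"])  # Alice
```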
Common mistake
Beginners often forget to parse the string output as JSON, treating it as plain text instead.
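To see why parsing matters, compare the raw string the API returns with the parsed dict:

```python
import json

raw = '{"name": "Alice", "age": 30}'  # the API returns a str, not a dict

# Mistake: raw["name"] raises TypeError (string indices must be integers).
data = json.loads(raw)  # parse first
print(type(data).__name__, data["age"])  # dict 30
```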