How to use Gemini for JSON output in Python
Direct answer
Use the gemini-1.5-pro model with the OpenAI Python SDK, pointed at Gemini's OpenAI-compatible endpoint, to request JSON output: instruct the model in the prompt, then parse response.choices[0].message.content as JSON.

Setup
Install
pip install openai
Env vars
OPENAI_API_KEY (holds your Gemini API key; the examples read the key from this variable)
Imports
import os
from openai import OpenAI
import json

Examples
in: Generate JSON with user info: name, age, city
out: {"name": "Alice", "age": 30, "city": "Seattle"}
in: Create JSON for a product with id, name, price
out: {"id": "P123", "name": "Wireless Mouse", "price": 25.99}
in: Output JSON for a book with title, author, year
out: {"title": "1984", "author": "George Orwell", "year": 1949}
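
The outputs above each carry a fixed set of fields, so a small check after parsing can catch replies that are valid JSON but miss a field or use the wrong type. A minimal sketch, assuming the field names and types of the first example; `REQUIRED_FIELDS` and `validate_user` are hypothetical names, not part of any SDK:

```python
import json

# Hypothetical schema for the first example: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "age": int, "city": str}


def validate_user(json_text: str) -> dict:
    """Parse json_text and check that every required field has the expected type."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


user = validate_user('{"name": "Alice", "age": 30, "city": "Seattle"}')
print(user["city"])
```

A reply such as `{"name": "Alice"}` would raise `ValueError` here instead of failing later when the missing field is read.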
Integration steps
- Install the OpenAI Python SDK and set your OPENAI_API_KEY in environment variables
- Import the OpenAI client and json module
- Initialize the OpenAI client with your API key from os.environ and set base_url to Gemini's OpenAI-compatible endpoint
- Create a chat completion request using model 'gemini-1.5-pro' with a user message instructing JSON output
- Parse the returned text from response.choices[0].message.content as JSON
- Use or print the parsed JSON object
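
Before the client is initialized, it can help to fail early with a readable message when the key is missing. A minimal sketch of that check; `require_env` is a hypothetical helper, not part of the OpenAI SDK:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return value


try:
    api_key = require_env("OPENAI_API_KEY")
    print("API key found, length:", len(api_key))
except RuntimeError as err:
    print(err)
```

The returned value can then be passed as `api_key` when the client is created.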
Full code
```python
import json
import os

from openai import OpenAI

# The OpenAI SDK must point at Gemini's OpenAI-compatible endpoint;
# the key stored in OPENAI_API_KEY must be a Gemini API key.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = (
    "Generate a JSON object with the following user info fields: name, age, city. "
    "Respond only with the JSON object, no extra text."
)

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}],
)

# The reply is a plain string; parse it before use.
json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_text)
```

API trace
Request
{"model": "gemini-1.5-pro", "messages": [{"role": "user", "content": "Generate a JSON object with the following user info fields: name, age, city. Respond only with the JSON object, no extra text."}]}
Response
{"choices": [{"message": {"content": "{\"name\": \"Alice\", \"age\": 30, \"city\": \"Seattle\"}"}}], "usage": {"total_tokens": 50}}
Extract
response.choices[0].message.content

Variants
Streaming JSON output
Use streaming when you want to display JSON output progressively for large responses or better UX. The text is valid JSON only once the stream has finished, so parse after joining all chunks.

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],  # must hold a Gemini API key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-1.5-pro",
    messages=[{"role": "user", "content": prompt}],
    stream=True,
)

json_chunks = []
for chunk in response:
    if not chunk.choices:
        continue  # some chunks carry no choices
    # delta is an object, not a dict, and its content may be None
    content = chunk.choices[0].delta.content or ""
    print(content, end="")  # Print as it streams
    json_chunks.append(content)

json_text = "".join(json_chunks)
try:
    data = json.loads(json_text)
    print("\nParsed JSON output:", data)
except json.JSONDecodeError:
    print("\nFailed to parse streamed JSON.")
```

Async JSON generation
Use async when integrating Gemini calls into asynchronous Python applications for concurrency.
```python
import asyncio
import json
import os

from openai import AsyncOpenAI

# AsyncOpenAI is the awaitable client; it uses the same create() method as the sync client.
client = AsyncOpenAI(
    api_key=os.environ["OPENAI_API_KEY"],  # must hold a Gemini API key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)


async def generate_json():
    prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."
    response = await client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": prompt}],
    )
    json_text = response.choices[0].message.content
    try:
        data = json.loads(json_text)
        print("Parsed JSON output:", data)
    except json.JSONDecodeError:
        print("Failed to parse JSON.")


asyncio.run(generate_json())
```

Use gemini-2.0-flash for faster JSON output
Use gemini-2.0-flash for lower latency and cost when JSON output speed is critical.
```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],  # must hold a Gemini API key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

prompt = "Generate a JSON object with fields: name, age, city. Respond only with JSON."

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": prompt}],
)

json_text = response.choices[0].message.content
try:
    data = json.loads(json_text)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON.")
```

Performance
Latency: ~700ms for gemini-1.5-pro, non-streaming
Cost: ~$0.0015 per 500 tokens
Rate limits: Tier 1: 600 RPM / 40K TPM
- Keep prompts concise to reduce token usage
- Use streaming for large JSON outputs to start processing early
- Cache frequent JSON responses to avoid repeated calls
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard call with gemini-1.5-pro | ~700ms | ~$0.0015 | Reliable JSON output with good quality |
| Streaming with gemini-1.5-pro | Starts immediately, total ~700ms | ~$0.0015 | Progressive JSON output for large data |
| Async call with gemini-1.5-pro | ~700ms | ~$0.0015 | Concurrent JSON generation in async apps |
| Using gemini-2.0-flash | ~400ms | ~$0.0010 | Faster, cost-effective JSON output |
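
To compare the rows above at volume, the per-call figures can be multiplied out. A rough sketch that takes the approximate costs from the table as given; `COST_PER_CALL_USD` and `estimate_cost` are hypothetical names, and real prices depend on token counts and current pricing:

```python
# Approximate per-call costs copied from the table above; treat them as estimates.
COST_PER_CALL_USD = {
    "gemini-1.5-pro": 0.0015,
    "gemini-2.0-flash": 0.0010,
}


def estimate_cost(model: str, calls: int) -> float:
    """Estimate the total cost in USD for a number of calls to one model."""
    return COST_PER_CALL_USD[model] * calls


for model in COST_PER_CALL_USD:
    print(f"{model}: ${estimate_cost(model, 10_000):.2f} for 10,000 calls")
```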
Quick tip
Always instruct Gemini explicitly to respond with JSON only and parse the output with a JSON parser to avoid errors.
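
Even with an explicit instruction, a reply may arrive wrapped in a Markdown code fence, which `json.loads` rejects. A minimal sketch of a tolerant parser, assuming the fence (if present) sits on the first and last lines of the reply; `parse_model_json` is a hypothetical helper:

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable


def parse_model_json(text: str):
    """Parse JSON from a model reply, tolerating an optional Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag such as "json").
        lines = cleaned.splitlines()[1:]
        # Drop the closing fence line if there is one.
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        cleaned = "\n".join(lines)
    return json.loads(cleaned)


fenced_reply = FENCE + 'json\n{"name": "Alice", "age": 30, "city": "Seattle"}\n' + FENCE
print(parse_model_json(fenced_reply)["name"])
print(parse_model_json('{"id": "P123", "price": 25.99}')["price"])
```

Replies that are still not valid JSON after this cleanup raise `json.JSONDecodeError`, so the existing try/except handling continues to apply.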
Common mistake
Beginners often forget to parse the string output as JSON, treating it as plain text instead.
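
The difference shows up as soon as a field is read. A short illustration using the book example from this page, with the reply hard-coded as a string so it runs offline:

```python
import json

# What response.choices[0].message.content holds: a string, not a dict.
raw = '{"title": "1984", "author": "George Orwell", "year": 1949}'

# Mistake: indexing the raw string with a key raises TypeError.
try:
    raw["title"]
except TypeError as err:
    print("raw reply is a string:", err)

# Fix: parse first, then read fields with their real types.
book = json.loads(raw)
print(book["title"], book["year"] + 1)
```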