Structured outputs with Ollama
Quick answer
Use ollama.chat with a prompt that instructs the model to output structured data such as JSON, then parse the response["message"]["content"] string in Python to extract the structured output reliably.

Prerequisites
- Python 3.8+
- The ollama Python package (pip install ollama)
- The llama3.2 model pulled locally
Setup
Install the ollama Python package and ensure you have the llama3.2 model downloaded locally (ollama pull llama3.2). Ollama runs locally without API keys.
Install with:

```shell
pip install ollama
```

Step by step
Call ollama.chat with a prompt that instructs the model to respond in JSON format. Then parse the JSON string from the response content.
```python
import json
import ollama

prompt = '''Generate a JSON object with keys "name" and "age" only.
Respond ONLY with the JSON.
'''

response = ollama.chat(model="llama3.2", messages=[{"role": "user", "content": prompt}])
json_str = response["message"]["content"]

try:
    data = json.loads(json_str)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_str)
```

Output
```
Parsed JSON output: {'name': 'Alice', 'age': 30}
```

Common variations
- Use other structured formats like XML or CSV by changing the prompt instructions.
- Use larger Ollama models such as llama3.3:70b for a bigger context window or better accuracy.
- Integrate with async Python by running ollama.chat in a thread or async wrapper, since the default Ollama client is synchronous.
- Recent Ollama versions also accept a format="json" argument to ollama.chat, which constrains the model to emit valid JSON.
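The async variation above can be sketched with asyncio.to_thread, which runs the blocking client call in a worker thread. Here chat_fn is a hypothetical injection point standing in for ollama.chat, so the pattern runs without a live daemon; swap in ollama.chat itself in real use:

```python
import asyncio
import json

async def chat_json_async(chat_fn, model: str, prompt: str) -> dict:
    # Run the synchronous client call in a worker thread so it does not
    # block the event loop, then parse the JSON content.
    response = await asyncio.to_thread(
        chat_fn, model=model, messages=[{"role": "user", "content": prompt}]
    )
    return json.loads(response["message"]["content"])

def fake_chat(model, messages):
    # Stand-in for ollama.chat that returns the same response shape.
    return {"message": {"content": '{"name": "Alice", "age": 30}'}}

data = asyncio.run(chat_json_async(fake_chat, "llama3.2", "Return name and age as JSON."))
print(data)  # {'name': 'Alice', 'age': 30}
```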
Troubleshooting
- If JSON parsing fails, verify the prompt strictly instructs the model to output only JSON without extra text.
- Check that the Ollama daemon is running locally on port 11434.
- Use print(response) to debug the raw model output.
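When parsing fails because the model wrapped the JSON in prose or a code fence, a best-effort extraction helper can often recover it. This is a sketch of a fallback, not part of the ollama API:

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Parse raw directly; on failure, pull the first {...} span out of
    surrounding prose or a markdown code fence."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        return json.loads(match.group(0))
    raise ValueError(f"No JSON object found in: {raw!r}")

messy = 'Sure! Here is the JSON you asked for: {"name": "Alice", "age": 30}'
print(extract_json(messy))  # {'name': 'Alice', 'age': 30}
```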
Key Takeaways
- Use explicit prompt instructions to get structured outputs from Ollama models.
- Parse the response["message"]["content"] string as JSON (or your target format) in Python.
- Ollama runs locally with zero authentication, simplifying integration.
- Test and debug raw outputs to ensure the model follows structured output constraints.