How-to · Beginner · 3 min read

Structured outputs with Ollama

Quick answer
Call ollama.chat with a prompt that instructs the model to emit structured data such as JSON, then parse the response["message"]["content"] string in Python (for example with json.loads). Always validate the parse, since models can wrap the JSON in extra text.

PREREQUISITES

  • Python 3.8+
  • pip install ollama
  • Ollama model llama3.2 installed locally

Setup

Install the ollama Python package and pull the llama3.2 model if you don't have it yet. Ollama runs entirely on your machine, so no API keys are required.

Install with:

```bash
pip install ollama
ollama pull llama3.2
```

Step by step

Call ollama.chat with a prompt that instructs the model to respond in JSON format. Then parse the JSON string from the response content.

```python
import json
import ollama

prompt = """Generate a JSON object with keys "name" and "age" only.
Respond ONLY with the JSON."""

response = ollama.chat(model="llama3.2", messages=[{"role": "user", "content": prompt}])

# The model's reply is a plain string; parse it as JSON
json_str = response["message"]["content"]

try:
    data = json.loads(json_str)
    print("Parsed JSON output:", data)
except json.JSONDecodeError:
    print("Failed to parse JSON. Raw output:", json_str)
```

Example output:

```
Parsed JSON output: {'name': 'Alice', 'age': 30}
```
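Recent versions of the ollama Python client also accept a format parameter on ollama.chat, which constrains the model to valid JSON instead of relying on the prompt alone. The response shape is unchanged, so the parsing step stays the same; the snippet below simulates the response dict so the parse step is shown without requiring a running daemon:

```python
import json

# A format-constrained chat call might look like this (requires a running
# Ollama daemon and a client version that supports the format parameter):
#   response = ollama.chat(
#       model="llama3.2",
#       messages=[{"role": "user", "content": "Give JSON with keys name and age."}],
#       format="json",
#   )
# Simulated response with the same shape as ollama.chat's return value:
response = {"message": {"role": "assistant", "content": '{"name": "Alice", "age": 30}'}}

# Parsing is identical to the prompt-only approach
data = json.loads(response["message"]["content"])
print(data)  # → {'name': 'Alice', 'age': 30}
```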

Common variations

  • Request other structured formats such as XML or CSV by changing the prompt instructions.
  • Pass format="json" to ollama.chat, if your client version supports it, to constrain output to valid JSON instead of relying on the prompt alone.
  • Try other Ollama models such as llama3.3-70b for a larger context window or better instruction following.
  • For async code, run ollama.chat in a thread (for example with asyncio.to_thread), since the default client is synchronous; recent versions of the library also ship an async client.
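The CSV variation can be parsed with the standard library alone. This is a sketch: the function name, prompt text, and column names here are illustrative assumptions, not part of the Ollama API.

```python
import csv
import io

def parse_csv_reply(text: str) -> list[dict]:
    """Parse a model reply that should contain CSV with a header row."""
    return list(csv.DictReader(io.StringIO(text.strip())))

# Simulated model reply for a prompt like
# "Respond ONLY with CSV with columns name,age."
reply = "name,age\nAlice,30\nBob,25"
rows = parse_csv_reply(reply)
print(rows)  # → [{'name': 'Alice', 'age': '30'}, {'name': 'Bob', 'age': '25'}]
```

Note that csv.DictReader returns every field as a string, so numeric columns need an explicit int() or float() conversion.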

Troubleshooting

  • If JSON parsing fails, verify the prompt strictly instructs the model to output only JSON without extra text.
  • Check that the Ollama daemon is running locally on port 11434.
  • Use print(response) to debug raw model output.
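A common failure mode is the model wrapping its JSON in extra prose or markdown fences. A small helper (a sketch; the function name is ours) can recover the object before parsing:

```python
import json

def extract_json_object(text: str) -> dict:
    """Pull the first {...} span out of a reply and parse it.

    Handles replies like 'Sure! Here is the JSON: {"name": "Alice"}'.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError(f"No JSON object found in: {text!r}")
    return json.loads(text[start:end + 1])

reply = 'Sure! Here is the JSON:\n```json\n{"name": "Alice", "age": 30}\n```'
print(extract_json_object(reply))  # → {'name': 'Alice', 'age': 30}
```

This naive slice from the first "{" to the last "}" works for a single top-level object (including nested ones) but not for replies containing multiple separate objects.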

Key Takeaways

  • Use explicit prompt instructions to get structured outputs from Ollama models.
  • Parse the response["message"]["content"] string as JSON or your target format in Python.
  • Ollama runs locally with zero authentication, simplifying integration.
  • Test and debug raw outputs to ensure the model follows structured output constraints.
Verified 2026-04 · llama3.2, llama3.3-70b