ValueError
builtins.ValueError
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    data = json.loads(streamed_response)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise ValueError("Expecting value", s, err.value) from None
ValueError: Expecting value: line 1 column 1 (char 0)
Why it happens
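json.loads expects one complete JSON document. Ollama's streaming API delivers a response in pieces, and a single piece is often not parseable on its own: it may be empty, may hold only a fragment of the model's text, or may be cut off partway through an object. Parsing the stream directly, without buffering or assembling the full JSON, raises json.JSONDecodeError, which is a subclass of ValueError. The message "Expecting value: line 1 column 1 (char 0)" means the parser found no valid JSON value at the very start of its input, which is typical of an empty string or of text that does not begin with JSON. Unexpected prefixes, suffixes, or formatting in the stream cause the same failure.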
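The failure can be reproduced with the standard library alone, which helps confirm that the problem is the input text and not the Ollama client. The sketch below is illustrative only: the fragments are made-up stand-ins for pieces of a streamed reply, and try_parse is a hypothetical helper.

```python
import json

# Hypothetical fragments standing in for pieces of a streamed reply.
fragments = ["", "Hello", '{"answer": 4']

def try_parse(text):
    """Return the parsed value, or the error message if parsing fails."""
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return f"ValueError: {exc}"

results = [try_parse(fragment) for fragment in fragments]
for fragment, result in zip(fragments, results):
    print(repr(fragment), "->", result)
```

The empty string and the plain word both fail at the first character, which matches the message in the stack trace. The truncated object fails later in the text with a different message, but it is still a ValueError.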
Detection
Monitor for ValueError exceptions during JSON parsing of streamed responses and log the raw streamed data to identify incomplete or malformed JSON chunks before processing.
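One way to do this is a small wrapper around json.loads that logs a preview of the raw text whenever parsing fails. This is a minimal sketch: parse_or_log, the logger name, and the 200-character preview length are assumptions chosen for illustration.

```python
import json
import logging

logger = logging.getLogger("stream_parser")  # hypothetical logger name

def parse_or_log(raw):
    """Parse raw text as JSON; on failure, log a preview of it and return None."""
    try:
        return json.loads(raw)
    except ValueError as exc:
        # Log only the start of the payload so large responses do not flood the log.
        logger.warning("JSON parse failed (%s); raw data starts with %r", exc, raw[:200])
        return None
```

Note that the valid JSON document null also parses to None, so callers that need to tell the two cases apart should return a separate success flag.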
Causes & fixes
Cause: Parsing incomplete or partial JSON chunks from the streaming response directly.
Fix: Buffer the streamed data until it forms a complete JSON document, then call json.loads once; alternatively, parse it incrementally with a streaming JSON parser.
Cause: The streamed response includes non-JSON prefixes, suffixes, or control characters.
Fix: Strip any non-JSON content before parsing, or configure the Ollama client to return raw JSON without extra formatting.
Cause: Calling synchronous json.loads on an asynchronous stream before all of the data has arrived.
Fix: Accumulate the full response asynchronously before parsing, or use an async-compatible JSON parser.
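The asynchronous case can be sketched as follows. To stay runnable without a server, the example uses fake_stream, a hypothetical async generator that stands in for whatever async chunk source the application uses; only the accumulate-then-parse pattern is the point.

```python
import asyncio
import json

async def fake_stream():
    """Hypothetical stand-in for an async chunk stream; yields text fragments."""
    for piece in ['{"city": ', '"Paris", ', '"temp_c": 21}']:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield piece

async def collect_and_parse(stream):
    """Accumulate every fragment from an async iterator, then parse once."""
    parts = []
    async for piece in stream:
        parts.append(piece)
    return json.loads("".join(parts))

result = asyncio.run(collect_and_parse(fake_stream()))
print(result)
```

None of the three fragments is valid JSON on its own, but the joined text is, so the single json.loads call at the end succeeds.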
Code: broken vs fixed
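import json
import ollama
# Broken: only one fragment of the stream is handed to json.loads
stream = ollama.chat(model="model", messages=[{"role": "user", "content": "Hello"}], stream=True)
streamed_response = next(iter(stream))['message']['content']  # first fragment only
data = json.loads(streamed_response)  # typically raises ValueError: a fragment is not complete JSON
print(data)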
import json
import ollama
# Fixed: accumulate the streamed response fully before parsing
streamed_chunks = []
# stream=True makes chat() yield chunks; format="json" asks the model for JSON output
for chunk in ollama.chat(
    model="model",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    format="json",
):
    streamed_chunks.append(chunk['message']['content'])
full_response = ''.join(streamed_chunks)
data = json.loads(full_response)  # parse once, on the complete text
print(data)
Workaround
Wrap the json.loads call in try/except ValueError, and on failure, log the raw streamed data and retry parsing after accumulating more data or cleaning the input.
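A minimal sketch of this workaround, covering the cleaning branch, is shown here. The function name, the logger name, and the cleanup heuristic (keep the span from the first "{" to the last "}") are assumptions for illustration; the heuristic only suits replies that contain a single JSON object.

```python
import json
import logging

logger = logging.getLogger("ollama_json")  # hypothetical logger name

def parse_with_cleanup(raw):
    """Try json.loads; on ValueError, log the raw text and retry on a cleaned span."""
    try:
        return json.loads(raw)
    except ValueError as exc:
        logger.warning("first parse failed (%s); raw=%r", exc, raw[:200])
    # Cleanup heuristic (an assumption): keep the span from the first '{' to the last '}'.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in streamed text")
    return json.loads(raw[start:end + 1])
```

For example, parse_with_cleanup('Sure: {"a": 1} done') returns {"a": 1}. If the cleaned span is still not valid JSON, the second json.loads raises ValueError, which tells the caller to keep accumulating data or give up.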
Prevention
Use Ollama's official client methods that handle streaming JSON parsing internally or implement a streaming JSON parser to process chunks incrementally, avoiding partial parse errors.
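When the raw HTTP stream is read directly, an incremental parser can be as simple as a line buffer, assuming the server sends one JSON object per line (newline-delimited JSON), as Ollama's HTTP API does when streaming. The sketch below is not part of any Ollama client: NDJSONBuffer is a hypothetical helper, and the sample chunks with their "response" and "done" fields are made up to resemble a streamed reply.

```python
import json

class NDJSONBuffer:
    """Collects raw text chunks and returns one parsed object per complete line."""

    def __init__(self):
        self._pending = ""  # text received so far that does not yet end in a newline

    def feed(self, chunk):
        """Add a chunk; return the parsed objects for every line completed by it."""
        self._pending += chunk
        # Everything before the last newline is complete; the remainder stays buffered.
        *complete, self._pending = self._pending.split("\n")
        return [json.loads(line) for line in complete if line.strip()]

# Hypothetical chunks: line boundaries do not align with chunk boundaries.
chunks = ['{"response": "Hel', 'lo", "done": false}\n{"response": "!", ', '"done": true}\n']
buffer = NDJSONBuffer()
text = ""
for chunk in chunks:
    for obj in buffer.feed(chunk):
        text += obj["response"]
print(text)
```

Because json.loads only ever sees a complete line, a chunk that ends in the middle of an object is held back until the rest of the line arrives.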