JSONDecodeError
json.decoder.JSONDecodeError
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    for output in client.predict(..., stream=True):
  File "/usr/local/lib/python3.9/site-packages/replicate/client.py", line 210, in predict
    yield json.loads(chunk)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Why it happens
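Replicate streams model output in chunks, and a single chunk is not guaranteed to be a complete JSON object. Calling json.loads() on a fragment, an empty chunk, or a non-JSON control message therefore fails. The message in the trace above, "Expecting value: line 1 column 1 (char 0)", means the parser found no JSON value at the very start of the string, which is what an empty chunk, a chunk that begins mid-object, or a non-JSON line produces.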
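The failure can be reproduced without any network call. The sketch below uses made-up chunks (the chunk contents and the try_parse helper are illustrative assumptions, not captured Replicate output) to show that individual fragments fail while the joined text parses.

```python
import json

# Hypothetical chunks: one JSON object split across two chunks, plus an empty one.
chunks = ['{"output": "Hel', 'lo"}', ""]


def try_parse(chunk):
    """Return the decoder's error message for a chunk, or None if it parses."""
    try:
        json.loads(chunk)
        return None
    except json.JSONDecodeError as exc:
        return exc.msg


# Each chunk fails on its own.
for chunk in chunks:
    print(repr(chunk), "->", try_parse(chunk))

# Joined together, the chunks form one valid object.
print(json.loads("".join(chunks)))
```

The chunk that starts mid-object and the empty chunk both report "Expecting value", the same message as in the stack trace.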
Detection
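Catch JSONDecodeError around json.loads() in streaming code and log the raw chunk, for example with repr() so that empty strings and stray whitespace are visible. The logged data shows whether the failing chunk is empty, truncated, or not JSON at all.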
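One possible way to capture that information is a small wrapper around json.loads(). This is a minimal sketch; the parse_chunk helper and the "stream_debug" logger name are hypothetical and not part of the Replicate SDK.

```python
import json
import logging

logger = logging.getLogger("stream_debug")  # hypothetical logger name


def parse_chunk(chunk):
    """Parse one chunk; on failure, log the raw data and return None."""
    try:
        return json.loads(chunk)
    except json.JSONDecodeError as exc:
        # %r (repr) makes empty strings, whitespace, and stray prefixes visible.
        logger.warning(
            "JSON decode failed (%s at char %d); raw chunk=%r",
            exc.msg, exc.pos, chunk,
        )
        return None
```

Calling parse_chunk(chunk) inside the streaming loop leaves a record of every chunk that did not parse, which is usually enough to tell fragmentation apart from control messages.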
Causes & fixes
Cause: Parsing each streamed chunk as a complete JSON object when chunks are partial or fragmented.
Fix: Buffer streamed chunks and concatenate them until a full JSON object has been received before calling json.loads().
Cause: The Replicate stream includes non-JSON control messages or empty chunks.
Fix: Skip empty strings and non-JSON lines before parsing, or use a streaming JSON parser that can handle partial data.
Cause: Calling json.loads() directly on raw stream chunks without handling the streaming protocol's framing.
Fix: Use the Replicate SDK's built-in streaming iterator or a JSON streaming parser designed for incremental parsing.
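The second fix (skipping empty and non-JSON lines) might look like the sketch below. It assumes the stream has already been split into text lines and that every payload is a JSON object or array; the iter_json_lines helper and the sample control messages are invented for illustration and do not describe Replicate's actual wire format.

```python
import json


def iter_json_lines(lines):
    """Yield parsed values from an iterable of text lines, skipping the rest."""
    for line in lines:
        text = line.strip()
        if not text:
            continue  # empty chunk or blank keep-alive line
        if text[0] not in "{[":
            continue  # hypothetical control message, e.g. "event: output"
        try:
            yield json.loads(text)
        except json.JSONDecodeError:
            continue  # looked like JSON but was malformed or partial


# Made-up sample input mixing payloads, blanks, and control lines.
sample = ["", "event: output", '{"token": "Hi"}', "   ", '{"token": "!"}', "[DONE]"]
results = list(iter_json_lines(sample))
print(results)
```

Silently dropping lines hides real errors, so in practice it may be worth logging what gets skipped.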
Code: broken vs fixed
import os
import json
from replicate import Client

client = Client(api_token=os.environ["REPLICATE_API_TOKEN"])

# This will raise JSONDecodeError because chunks are partial JSON
for chunk in client.predict(
    "owner/model:version",
    input={"prompt": "Hello"},
    stream=True
):
    data = json.loads(chunk)  # Error here
    print(data)

import os
import json
from replicate import Client

client = Client(api_token=os.environ["REPLICATE_API_TOKEN"])

buffer = ""
for chunk in client.predict(
    "owner/model:version",
    input={"prompt": "Hello"},
    stream=True
):
    if not chunk:
        # Skip empty chunks; they carry no data
        continue
    buffer += chunk
    try:
        data = json.loads(buffer)
    except json.JSONDecodeError:
        # Incomplete JSON so far: wait for more chunks
        continue
    print(data)
    buffer = ""  # Reset buffer after a successful parse

if buffer:
    # The stream ended with data that never formed valid JSON
    print(f"Unparsed trailing data: {buffer!r}")
# Fixed: buffer chunks until valid JSON can be parsed
Workaround
Wrap json.loads() in try/except JSONDecodeError, accumulate chunks in a buffer, and parse only when the buffer contains a complete JSON object.
Prevention
Use the Replicate SDK's official streaming iterator, or a streaming JSON parser that supports incremental parsing, to handle partial JSON chunks robustly.
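Where a dedicated streaming parser is not available, incremental parsing can be approximated with the standard library's json.JSONDecoder.raw_decode(), which returns a parsed value together with the index where it ended. The IncrementalJSONParser class below is a hypothetical helper written for this page, not an SDK feature, and it assumes the stream carries JSON objects or arrays; bare top-level numbers split across chunks would be misread, and malformed data stays in the buffer indefinitely.

```python
import json


class IncrementalJSONParser:
    """Accumulates text and returns each complete top-level JSON value."""

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buffer = ""

    def feed(self, chunk):
        """Add a chunk and return the list of values completed by it."""
        self._buffer += chunk
        values = []
        while True:
            # raw_decode() does not skip leading whitespace, so strip it here.
            self._buffer = self._buffer.lstrip()
            if not self._buffer:
                break
            try:
                value, end = self._decoder.raw_decode(self._buffer)
            except json.JSONDecodeError:
                break  # incomplete value: keep the buffer and wait for more
            values.append(value)
            self._buffer = self._buffer[end:]
        return values


parser = IncrementalJSONParser()
first = parser.feed('{"a": 1}{"b"')            # one complete object, one partial
second = parser.feed(': 2}\n{"c": 3}')         # completes the partial, adds another
print(first, second)
```

Unlike a plain buffer-and-retry loop, this approach also copes with a chunk that contains the end of one object and the start of the next.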