High severity · Intermediate · Fix: 5-10 min

JSONDecodeError

json.decoder.JSONDecodeError

What this error means
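Python's json.loads() was called on a streamed chunk from Replicate that is not a complete JSON document (a partial fragment, an empty string, or a non-JSON message), so the parser raises JSONDecodeError partway through the stream.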

Stack trace

traceback
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    for output in client.predict(..., stream=True):
  File "/usr/local/lib/python3.9/site-packages/replicate/client.py", line 210, in predict
    yield json.loads(chunk)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
QUICK FIX
Use the Replicate SDK's streaming iterator as documented, or buffer chunks until they form a complete JSON object before parsing.

Why it happens

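Replicate streams model output in chunks that are not guaranteed to be complete JSON objects, so json.loads() fails when it is called on partial data. The stream may split one JSON object across several chunks, or interleave empty chunks and non-JSON control messages. The message in the stack trace, "Expecting value: line 1 column 1 (char 0)", means the chunk was empty or began with a character that cannot start a JSON value.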
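The failure can be reproduced without any network call. The sketch below feeds a few hand-written sample chunks to json.loads() and records the resulting message; the samples and the `describe_failure` helper are illustrative assumptions, not captured Replicate output.

```python
import json


def describe_failure(chunk: str) -> str:
    """Return the JSONDecodeError message for a chunk, or "ok" if it parses."""
    try:
        json.loads(chunk)
        return "ok"
    except json.JSONDecodeError as exc:
        return str(exc)


# Hypothetical chunks: two halves of one object, an empty chunk,
# and a line carrying a non-JSON prefix.
samples = ['{"output": "Hel', 'lo"}', "", 'data: {"output": "Hi"}']

results = {sample: describe_failure(sample) for sample in samples}
for sample, message in results.items():
    print(repr(sample), "->", message)
```

Only the first half of a split object reports an error in the middle of the text; the second half, the empty chunk, and the prefixed line all fail at character 0, which matches the message in the stack trace.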

Detection

Monitor for JSONDecodeError exceptions during streaming calls and log raw chunk data to identify incomplete or malformed JSON fragments before parsing.
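One way to apply this is a small wrapper that logs the raw chunk whenever parsing fails. This is a minimal sketch: the `parse_or_log` name, the logger name, and the 200-character truncation are assumptions made for illustration.

```python
import json
import logging

logger = logging.getLogger("stream_debug")  # hypothetical logger name


def parse_or_log(chunk: str):
    """Parse one chunk; on failure, log its repr and return None."""
    try:
        return json.loads(chunk)
    except json.JSONDecodeError as exc:
        # repr() makes empty strings, whitespace, and control characters visible.
        logger.warning(
            "Unparseable chunk (%d chars): %r (%s)", len(chunk), chunk[:200], exc
        )
        return None
```

Because None is also what a JSON `null` parses to, this helper suits diagnosis better than production parsing.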

Causes & fixes

1

Parsing each streamed chunk as a complete JSON object when chunks are partial or fragmented.

✓ Fix

Buffer streamed chunks and concatenate until a full JSON object is received before calling json.loads().

2

Replicate stream includes non-JSON control messages or empty chunks.

✓ Fix

Add checks to skip empty strings or non-JSON lines before parsing, or use a streaming JSON parser that can handle partial data.

3

Using json.loads() directly on raw stream chunks without handling streaming protocol framing.

✓ Fix

Use the Replicate SDK's built-in streaming iterator, or a streaming JSON parser designed for incremental parsing.
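The fixes above can be combined into one incremental parser built on the standard library's json.JSONDecoder.raw_decode(), which parses a single value from the start of a string and reports where it ended. This is a sketch under assumptions: `iter_json_objects` is a hypothetical helper, chunks are assumed to be str, and each streamed value is assumed to be a JSON object or array (a bare number split across chunks could be yielded too early).

```python
import json


def iter_json_objects(chunks):
    """Yield each complete JSON value found in an iterable of text chunks."""
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        if not chunk:
            continue  # skip empty chunks
        buffer += chunk
        while True:
            buffer = buffer.lstrip()  # drop separators such as newlines
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete so far; wait for the next chunk
            yield obj
            buffer = buffer[end:]  # keep whatever follows the parsed value


# Hypothetical fragmented stream: values split and concatenated across chunks.
parsed = list(iter_json_objects(['{"a": ', '1}{"b"', ': 2}\n', "", '{"c": 3}']))
print(parsed)
```

Unlike calling json.loads() on the whole buffer, this handles two objects arriving in the same chunk. Text that never becomes valid JSON stays in the buffer indefinitely, so a size limit is worth adding for untrusted streams.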

Code: broken vs fixed

Broken - triggers the error
python
import os
import json
from replicate import Client

client = Client(api_token=os.environ["REPLICATE_API_TOKEN"])

# This will raise JSONDecodeError because chunks are partial JSON
for chunk in client.predict(
    "owner/model:version",
    input={"prompt": "Hello"},
    stream=True
):
    data = json.loads(chunk)  # Error here
    print(data)
Fixed - works correctly
python
import os
import json
from replicate import Client

client = Client(api_token=os.environ["REPLICATE_API_TOKEN"])

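# Fixed: buffer chunks until valid JSON can be parsed
buffer = ""
for chunk in client.predict(
    "owner/model:version",
    input={"prompt": "Hello"},
    stream=True
):
    if not chunk:
        continue  # Skip empty chunks
    buffer += chunk
    try:
        data = json.loads(buffer)
    except json.JSONDecodeError:
        # Incomplete JSON so far; wait for more chunks
        continue
    print(data)
    buffer = ""  # Reset buffer after successful parse

# Anything left here never formed valid JSON; report it instead of dropping it silently
if buffer:
    print(f"Unparsed trailing data: {buffer!r}")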
The fixed version buffers streamed chunks until they form a complete JSON object before calling json.loads(), which prevents JSONDecodeError on partial data. It assumes the stream delivers one JSON object at a time.

Workaround

Wrap json.loads() in try/except JSONDecodeError, accumulate chunks in a buffer, and parse only when the buffer contains a complete JSON object.

Prevention

Use the Replicate SDK's official streaming iterator, or a streaming JSON parser that supports incremental parsing, so that partial JSON chunks are handled robustly.

Python 3.9+ · replicate >=0.9.0 · tested on 0.9.x
Verified 2026-04