RuntimeError
mistral.client.exceptions.RuntimeError: Streaming response incomplete
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    for chunk in client.chat.completions.create(model="mistral-7b", stream=True):
  File "mistral/client/streaming.py", line 88, in __iter__
    raise RuntimeError("Streaming response incomplete")
RuntimeError: Streaming response incomplete
Why it happens
Mistral's streaming endpoint may terminate early due to network interruptions, server-side timeouts, or client-side mismanagement of the stream iterator. This causes the client to detect an incomplete response and raise a RuntimeError.
Detection
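Wrap the loop that consumes the stream in a try/except for RuntimeError and log the partial output received so far, so an incomplete stream is caught before its contents are processed. A stream that simply stops early ends a for loop silently, because Python treats StopIteration as normal completion, so also check that the accumulated output looks complete.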
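As a rough illustration of this kind of monitoring, the sketch below wraps any iterable of text chunks, records what arrives, and logs how much was received if a RuntimeError interrupts the stream. The names monitored and fake_stream are hypothetical helpers invented for this example; fake_stream only simulates a truncated stream and does not call the Mistral client.

```python
import logging
from typing import Iterable, Iterator, List

logger = logging.getLogger("stream_monitor")


def monitored(stream: Iterable[str], received: List[str]) -> Iterator[str]:
    """Yield items from `stream`, recording each one in `received`.

    If the stream raises RuntimeError, log how much arrived and re-raise.
    """
    try:
        for piece in stream:
            received.append(piece)
            yield piece
    except RuntimeError as exc:
        logger.warning(
            "stream failed after %d chunks (%d chars): %s",
            len(received),
            sum(len(p) for p in received),
            exc,
        )
        raise


def fake_stream() -> Iterator[str]:
    # Hypothetical stand-in for a streaming call that is cut off early.
    yield "Hello, "
    yield "wor"
    raise RuntimeError("Streaming response incomplete")


partial: List[str] = []
try:
    for text in monitored(fake_stream(), partial):
        pass  # process each chunk here
except RuntimeError:
    # The caller still has everything that arrived before the failure.
    print("partial output:", "".join(partial))
```

Passing the list in from the caller keeps the partial output available after the exception propagates, which is what makes it possible to log or inspect it.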
Causes & fixes
Network connection dropped during streaming response
Implement robust retry logic with exponential backoff around the streaming call to reconnect and resume or restart the request.
Client code does not fully consume the streaming iterator
Ensure your code fully iterates over the streaming generator until completion or handles early termination gracefully.
Server-side timeout or internal error cuts off the stream
Check server logs and increase timeout settings if possible; also handle incomplete streams by retrying or by falling back to non-streaming calls.
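The retry advice above could look something like the following sketch. It assumes the request can simply be restarted from scratch, and it takes the streaming call as a zero-argument callable (start_stream) rather than calling any particular client. collect_with_retry, its parameters, and flaky_stream are all hypothetical names for illustration.

```python
import random
import time
from typing import Callable, Iterable, List

INCOMPLETE = "Streaming response incomplete"


def collect_with_retry(
    start_stream: Callable[[], Iterable[str]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Restart the stream on an incomplete-stream error, backing off between tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        pieces: List[str] = []  # discard partial output from earlier attempts
        try:
            for piece in start_stream():
                pieces.append(piece)
            return "".join(pieces)
        except RuntimeError as exc:
            attempt += 1
            # Re-raise unrelated errors, and give up once attempts run out.
            if INCOMPLETE not in str(exc) or attempt >= max_attempts:
                raise
            # Delay doubles after each failure; jitter spreads out retries.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


calls = {"n": 0}


def flaky_stream():
    # Simulated stream: fails on the first two calls, succeeds on the third.
    calls["n"] += 1
    yield "partial "
    if calls["n"] < 3:
        raise RuntimeError(INCOMPLETE)
    yield "and complete"


result = collect_with_retry(flaky_stream, sleep=lambda seconds: None)
print(result)
```

Restarting rather than resuming is the simpler choice here; resuming would require the server to support continuing from a known position, which this sketch does not assume.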
Code: broken vs fixed
from mistral import Client
client = Client(api_key="sk-incorrect")
# This will raise RuntimeError if stream ends early
for chunk in client.chat.completions.create(model="mistral-7b", stream=True):
    print(chunk.choices[0].message.content)  # RuntimeError: Streaming response incomplete

import os
from mistral import Client
client = Client(api_key=os.environ["MISTRAL_API_KEY"])
try:
    for chunk in client.chat.completions.create(model="mistral-7b", stream=True):
        print(chunk.choices[0].message.content)
except RuntimeError as e:
    if str(e) == "Streaming response incomplete":
        print("Stream ended early, retrying...")
        # Retry logic here
    else:
        raise
# Changed to use environment variable and added retry handling for incomplete streams
Workaround
Catch the RuntimeError raised on an incomplete stream, buffer the partial output, and fall back to a non-streaming completion call to get the full response.
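A minimal sketch of this workaround follows. Both calls are passed in as plain callables, so nothing here depends on a specific client API; complete_with_fallback, broken_stream, and full_response are hypothetical names, and the two stand-in functions only simulate a truncated stream and a successful non-streaming call.

```python
from typing import Callable, Iterable, List, Tuple


def complete_with_fallback(
    stream_call: Callable[[], Iterable[str]],
    blocking_call: Callable[[], str],
) -> Tuple[str, str]:
    """Return (text, partial).

    `text` is the full response; `partial` is whatever the stream delivered
    before failing, or an empty string if the stream completed normally.
    """
    buffer: List[str] = []
    try:
        for piece in stream_call():
            buffer.append(piece)
        return "".join(buffer), ""
    except RuntimeError as exc:
        if "Streaming response incomplete" not in str(exc):
            raise  # unrelated error: do not mask it
        # Keep the partial text for logging, then fetch the full response.
        return blocking_call(), "".join(buffer)


def broken_stream():
    # Simulated stream that is cut off after one chunk.
    yield "The answer"
    raise RuntimeError("Streaming response incomplete")


def full_response() -> str:
    # Simulated non-streaming completion call.
    return "The answer is 42."


text, partial_text = complete_with_fallback(broken_stream, full_response)
print(text)
```

Returning the partial text alongside the final response lets the caller log what was lost without mixing truncated output into the result.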
Prevention
Use robust network handling with retries and timeouts, fully consume streaming iterators, and prefer stable network environments to avoid premature stream termination.