GroqStreamingIncompleteResponseError
groq.client.errors.GroqStreamingIncompleteResponseError
Stack trace
groq.client.errors.GroqStreamingIncompleteResponseError: Streaming response ended prematurely without completion token
  File "/app/main.py", line 42, in generate_response
    response = client.chat.completions.create(...)
  File "/usr/local/lib/python3.10/site-packages/groq/client.py", line 210, in create
    raise GroqStreamingIncompleteResponseError("Streaming response incomplete")
Why it happens
Groq's streaming endpoint expects a complete response terminated by a specific end-of-stream token. Network interruptions, server-side timeouts, or client-side premature cancellations can cause the stream to end before this token is received, triggering this error.
Detection
Track whether each streaming response ends with the expected end-of-stream token. When a stream stops without it, log the partial output received so far, so that incomplete streams are recorded before the error is raised.
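A minimal sketch of this check, assuming each streamed chunk is a dict with an optional `content` string and a `finish_reason` that is set only on the final chunk. The chunk shape, the `IncompleteStreamError` class, and the `consume_stream` helper are hypothetical and stand in for whatever the client library actually yields.

```python
import logging
from typing import Iterable, List, Optional

logger = logging.getLogger("stream_monitor")


class IncompleteStreamError(Exception):
    """Hypothetical error: the stream ended without a finish marker."""

    def __init__(self, partial_text: str):
        super().__init__("stream ended without a finish marker")
        self.partial_text = partial_text


def consume_stream(chunks: Iterable[dict]) -> str:
    """Collect streamed text and verify that a finish marker arrived.

    Assumes each chunk looks like {"content": str or None, "finish_reason": str or None}.
    """
    parts: List[str] = []
    finish_reason: Optional[str] = None
    for chunk in chunks:
        text = chunk.get("content")
        if text:
            parts.append(text)
        if chunk.get("finish_reason") is not None:
            finish_reason = chunk["finish_reason"]
    partial = "".join(parts)
    if finish_reason is None:
        # Log what was received so the truncated output can be inspected later.
        logger.warning("incomplete stream, %d chars received: %r", len(partial), partial)
        raise IncompleteStreamError(partial)
    return partial
```

Keeping the partial text on the exception lets the caller decide whether to discard it, show it, or include it in a bug report.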
Causes & fixes
Network instability or timeout causing the stream to cut off early
Implement retry logic with exponential backoff on streaming failures and ensure stable network connectivity.
Client-side code prematurely closing the stream or cancelling the request
Review client code to avoid cancelling or closing the stream before the full response is received.
Server-side Groq model hitting internal timeout or resource limits
Contact Groq support to check server logs and increase timeout or resource allocation if needed.
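The retry fix for the first cause can be sketched as a small generic helper. This is an illustration under assumptions, not part of any Groq library: `retry_with_backoff` and `backoff_delays` are hypothetical names, and the exception types to retry on are supplied by the caller.

```python
import random
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Return the un-jittered wait before each retry: base * 2**n, capped at cap."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


def retry_with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on a listed exception, wait and try again up to max_retries times."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Full jitter: sleep a random amount up to the exponential delay.
            sleep(random.uniform(0, delays[attempt]))
    raise ValueError("max_retries must be non-negative")
```

For streaming, `call` should both open the stream and read it to the end, so that a failure partway through the response is retried as well. The jitter spreads retries out when many clients fail at the same moment.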
Code: broken vs fixed
from groq import GroqClient
client = GroqClient(api_key="mykey")
response = client.chat.completions.create(model="groq-llm", messages=[{"role": "user", "content": "Hello"}], stream=True)  # This line triggers GroqStreamingIncompleteResponseError

import os
from groq import GroqClient, GroqStreamingIncompleteResponseError
client = GroqClient(api_key=os.environ["GROQ_API_KEY"])
try:
    response = client.chat.completions.create(model="groq-llm", messages=[{"role": "user", "content": "Hello"}], stream=True)
except GroqStreamingIncompleteResponseError:
    # Retry once on incomplete stream
    response = client.chat.completions.create(model="groq-llm", messages=[{"role": "user", "content": "Hello"}], stream=True)
print(response)
Workaround
Wrap the streaming call in a try/except block that catches GroqStreamingIncompleteResponseError. Buffer the partial output as it arrives, and retry the request if the stream ends early.
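One possible shape for this workaround is sketched below. It assumes a hypothetical `open_stream` factory that starts a fresh request and yields text pieces, raising the given exception type when the stream is cut off. Neither `open_stream` nor `stream_with_retry` comes from the Groq client; the caller passes in the exception class to catch.

```python
from typing import Callable, Iterable, List, Tuple, Type


def stream_with_retry(
    open_stream: Callable[[], Iterable[str]],
    incomplete_error: Type[Exception],
    max_attempts: int = 2,
) -> Tuple[str, List[str]]:
    """Return (full_text, partials); partials holds the text of each failed attempt.

    open_stream is a hypothetical factory that starts a new request and yields
    text pieces; it is expected to raise incomplete_error if the stream is cut off.
    """
    partials: List[str] = []
    for attempt in range(1, max_attempts + 1):
        buffer: List[str] = []
        try:
            for piece in open_stream():
                buffer.append(piece)
            return "".join(buffer), partials
        except incomplete_error:
            # Keep what arrived so it can be logged or shown as a fallback.
            partials.append("".join(buffer))
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

Each attempt starts with an empty buffer because a retried request generates a new response. Appending the new output to an earlier partial would produce duplicated or inconsistent text.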
Prevention
Use robust network infrastructure, implement client-side retries with backoff, and monitor Groq server health to avoid premature stream termination.