StopIteration or AttributeError on iterate_chunks()
google.generativeai.types.stream.AsyncGenerateContentResponse.iterate_chunks() / GenerateContentResponse.iterate_chunks()
Stack trace
Traceback (most recent call last):
  File "app.py", line 42, in <module>
    for chunk in response.iterate_chunks():
  File "google/generativeai/types/stream.py", line 156, in iterate_chunks
    raise StopIteration
StopIteration
OR
  File "app.py", line 42, in <module>
    for chunk in response.iterate_chunks():
AttributeError: 'GenerateContentResponse' object has no attribute 'iterate_chunks'
Why it happens
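As described here, iterate_chunks() is only expected on streaming responses created by passing stream=True to generate_content(). Without stream=True, generate_content() returns a fully resolved GenerateContentResponse, and calling iterate_chunks() on it raises AttributeError. StopIteration appears when the stream is already exhausted, for example when the same stream is iterated a second time, or when an async stream was not awaited correctly. The call also fails if the response object is None or if the connection drops before any chunks arrive. Note that the published google-generativeai SDK documents consuming a stream by iterating the response object directly (for chunk in response:); if your installed version does not define iterate_chunks() at all, the AttributeError occurs even with stream=True, and direct iteration is the fix.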
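The exhausted-stream case can be reproduced without the SDK. The sketch below uses a plain Python generator, fake_stream(), as a hypothetical stand-in for a streaming response; it is not the library's implementation, only an illustration of why a second pass over the same stream yields nothing and why next() on it raises StopIteration.

```python
def fake_stream():
    """Hypothetical stand-in for a streaming response: yields three text chunks."""
    yield from ["Quantum ", "computing ", "basics."]


def next_raises_stop_iteration(iterator):
    """Return True if calling next() on the iterator raises StopIteration."""
    try:
        next(iterator)
    except StopIteration:
        return True
    return False


stream = fake_stream()
first_pass = list(stream)    # consumes every chunk
second_pass = list(stream)   # the stream is exhausted, so nothing is left
exhausted = next_raises_stop_iteration(stream)

print(first_pass)
print(second_pass)
print(exhausted)
```

A for loop hides StopIteration and simply ends, which is why a second loop over the same stream silently does nothing; the exception only surfaces when next() is called directly or is raised from inside the iteration code.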
Detection
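Before calling iterate_chunks(), confirm that stream=True was passed to generate_content() and that the response object actually exposes the method, for example with hasattr(response, 'iterate_chunks'). An isinstance() check against GenerateContentResponse alone may not distinguish streaming from non-streaming responses, since both can be returned as the same class. Log type(response) before iterating so that an AttributeError can be traced back to the call site that produced the response.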
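One way to sketch this check is a small duck-typing helper that logs the response type and reports whether the method is callable. The helper name supports_iterate_chunks() and the two Fake* classes are hypothetical and exist only for illustration; they are not part of any SDK.

```python
import logging

logger = logging.getLogger(__name__)


def supports_iterate_chunks(response):
    """Return True if `response` exposes a callable iterate_chunks()."""
    # Log the concrete type so a later AttributeError is easy to trace.
    logger.debug("response type: %s", type(response).__name__)
    return callable(getattr(response, "iterate_chunks", None))


class FakeStreamingResponse:
    """Hypothetical stand-in for a response that supports chunk iteration."""

    def iterate_chunks(self):
        return iter(["a", "b"])


class FakeStaticResponse:
    """Hypothetical stand-in for a fully resolved, non-streaming response."""

    text = "full text"


print(supports_iterate_chunks(FakeStreamingResponse()))
print(supports_iterate_chunks(FakeStaticResponse()))
print(supports_iterate_chunks(None))
```

Because getattr() is given a default, the helper also returns False for None instead of raising, so it can guard the call in one place.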
Causes & fixes
Cause: iterate_chunks() was called on a non-streaming response (generate_content() without stream=True).
Fix: Pass stream=True to generate_content(): response = model.generate_content(prompt, stream=True)
Cause: The same stream was iterated twice, or iterated after it was already exhausted.
Fix: Iterate each stream object only once. Collect the chunks into a list during the first loop and reuse that list afterwards. If the model must stream again, create a fresh streaming response.
Cause: The async API was used without a proper await, or sync and async calls were mixed.
Fix: In async code, await the async call and then iterate with async for: response = await model.generate_content_async(prompt, stream=True), followed by async for chunk in response: ... In sync code, use the blocking generate_content() without await. Never combine the sync generate_content() with async iteration.
Cause: The response object is None, or was never created, because of an API error, a missing API key, or a network failure before the stream was initialized.
Fix: Wrap generate_content() in try/except, check that the response is not None before iterating, and verify that the API key is set, for example by reading GOOGLE_API_KEY from os.environ and passing it to genai.configure().
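The await-then-iterate pattern for async code, together with collecting chunks once for reuse, can be sketched with the standard library alone. Both fake_async_stream() and fake_generate_content_async() are hypothetical stand-ins for an async streaming call; they only mimic the shape of such an API and make no claim about the SDK's internals.

```python
import asyncio


async def fake_async_stream():
    """Hypothetical stand-in for an async streaming response."""
    for text in ["Hello", ", ", "world"]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield text


async def fake_generate_content_async(prompt):
    """Hypothetical stand-in for an awaitable call that returns a stream."""
    return fake_async_stream()


async def collect(prompt):
    # Await the call first to obtain the stream object...
    response = await fake_generate_content_async(prompt)
    chunks = []
    # ...then consume it exactly once with `async for`.
    async for chunk in response:
        chunks.append(chunk)
    return chunks


chunks = asyncio.run(collect("hi"))
reused = "".join(chunks)  # the collected list can be reused; the stream cannot
print(reused)
```

Using a plain for loop on the async stream, or forgetting the await, would fail with a TypeError in this sketch, which mirrors the sync/async mixing problem described above.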
Code: broken vs fixed
Broken:
import google.generativeai as genai
import os
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')
prompt = 'Explain quantum computing in 500 words.'
response = model.generate_content(prompt)  # ❌ Missing stream=True
# This line fails with AttributeError
for chunk in response.iterate_chunks():
    print(chunk.text)

Fixed:
import google.generativeai as genai
import os
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')
prompt = 'Explain quantum computing in 500 words.'
response = model.generate_content(prompt, stream=True)  # ✅ Added stream=True
# Iterate the streaming response directly, and only once. This is the SDK's
# documented way to consume chunks and works even where iterate_chunks()
# is not defined.
for chunk in response:
    if chunk.text:
        print(chunk.text, end='', flush=True)
print('\nStream completed successfully.')
Workaround
If you cannot add stream=True immediately (for example, because of legacy code constraints), fetch the full response and split its text into chunks yourself: response = model.generate_content(prompt); chunks = response.text.split(' '); then process each chunk in a loop. This gives up real-time streaming, because nothing is available until the whole response has arrived, but it still allows chunk-by-chunk processing.
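Splitting on single spaces produces one-word chunks, which is often finer than needed. A minimal sketch of a grouping helper follows; split_into_chunks() and its words_per_chunk parameter are hypothetical names, and sample_text stands in for response.text from a non-streaming call.

```python
def split_into_chunks(text, words_per_chunk=5):
    """Group the words of `text` into chunks of at most `words_per_chunk` words."""
    words = text.split()  # split() with no argument also collapses repeated whitespace
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


# Stand-in for response.text from a non-streaming generate_content() call.
sample_text = "Quantum computers use qubits to represent many states at once"
pieces = split_into_chunks(sample_text, words_per_chunk=4)

for piece in pieces:
    print(piece)
```

The last chunk may be shorter than the others; downstream code that assumes fixed-size chunks should account for that.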
Prevention
If you need chunk-level access, enable streaming at the API call site: pass stream=True for any LLM response you will process incrementally. A small helper can enforce this: def stream_generate(model, prompt): return model.generate_content(prompt, stream=True). Routing calls through the helper keeps non-streaming calls from reaching code paths that expect a stream. Cover stream iteration explicitly in unit tests so that chunk-iteration failures are caught before deployment.
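The helper and a matching unit test can be sketched without network access by substituting a test double for the model. FakeModel is a hypothetical class written for this example; it only records how generate_content() was called and returns a plain iterator when stream=True.

```python
class FakeModel:
    """Hypothetical test double that records how generate_content() was called."""

    def __init__(self):
        self.calls = []

    def generate_content(self, prompt, stream=False):
        self.calls.append({"prompt": prompt, "stream": stream})
        # Return an iterator of chunks when streaming, a plain string otherwise.
        return iter(["chunk-1", "chunk-2"]) if stream else "full text"


def stream_generate(model, prompt):
    """Always request a streaming response from the model."""
    return model.generate_content(prompt, stream=True)


def test_stream_generate_forces_streaming():
    model = FakeModel()
    chunks = list(stream_generate(model, "hello"))
    # The helper must have passed stream=True and yielded every chunk.
    assert model.calls == [{"prompt": "hello", "stream": True}]
    assert chunks == ["chunk-1", "chunk-2"]


test_stream_generate_forces_streaming()
```

A test like this verifies the call-site contract only; it does not exercise the real service, so an integration test against the actual API is still worthwhile where feasible.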