GroqContextLengthExceededError
groq.client.errors.GroqContextLengthExceededError
Stack trace
groq.client.errors.GroqContextLengthExceededError: Input tokens exceed the maximum context length allowed by the Groq model (e.g., 8192 tokens).
Why it happens
Groq models have a fixed maximum context length that limits the number of tokens in the input prompt plus completion. When the combined token count exceeds this limit, the Groq client raises this error to prevent invalid requests. This often happens when sending very long prompts or concatenating multiple documents without truncation.
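The budget described above is simple arithmetic, sketched below. The 8192-token limit comes from the error message; the 1024-token completion reserve and both function names are illustrative assumptions, not values or helpers from the Groq client.

```python
# Hypothetical numbers for illustration; check your model's documented limit.
CONTEXT_LIMIT = 8192        # total tokens the model accepts (prompt + completion)
COMPLETION_RESERVE = 1024   # tokens set aside for the model's reply (assumed)


def prompt_budget(context_limit: int, completion_reserve: int) -> int:
    """Return how many tokens the prompt may use once the reply is reserved."""
    if completion_reserve >= context_limit:
        raise ValueError("completion reserve must be smaller than the context limit")
    return context_limit - completion_reserve


def fits(prompt_tokens: int, context_limit: int, completion_reserve: int) -> bool:
    """True when prompt plus reserved completion stays within the context limit."""
    return prompt_tokens + completion_reserve <= context_limit


print(prompt_budget(CONTEXT_LIMIT, COMPLETION_RESERVE))  # 7168
print(fits(7000, CONTEXT_LIMIT, COMPLETION_RESERVE))     # True
print(fits(7500, CONTEXT_LIMIT, COMPLETION_RESERVE))     # False
```

A prompt that fits the raw limit can still fail once the requested completion length is added, which is why the reserve is subtracted first.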
Detection
Count tokens before sending each request, using Groq's tokenizer utilities where available or, failing that, logging input lengths as a rough proxy. Catch GroqContextLengthExceededError to identify inputs that are still too large.
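A pre-flight check along these lines might look like the sketch below. It assumes a rough ratio of four characters per token instead of a real tokenizer, so treat the counts as estimates; `estimate_tokens` and `check_prompt` are hypothetical helper names.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("context_check")

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def check_prompt(text: str, prompt_budget: int) -> bool:
    """Log the estimated size and report whether the prompt fits the budget."""
    estimated = estimate_tokens(text)
    if estimated > prompt_budget:
        logger.warning("Prompt too large: ~%d tokens, budget %d", estimated, prompt_budget)
        return False
    logger.info("Prompt OK: ~%d tokens, budget %d", estimated, prompt_budget)
    return True
```

Calling `check_prompt(prompt, 7168)` before each request leaves a log trail of input sizes, which makes oversized inputs easier to trace when the error does occur.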
Causes & fixes
Cause: Input prompt plus completion length exceeds Groq model's max token limit (e.g., 8192 tokens).
Fix: Truncate or summarize input prompts to fit within the model's maximum context length before sending the request.
Cause: Concatenating multiple documents or data sources without checking combined token count.
Fix: Implement token counting and chunking logic to split inputs into smaller segments that fit the context window.
Cause: Using a Groq model with a smaller context length than expected for your use case.
Fix: Switch to a Groq model variant with a larger context window if available, or reduce input size accordingly.
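The chunking fix can be sketched as follows. This is a minimal version that splits on whitespace and estimates tokens at four characters each; `chunk_text` is a hypothetical helper, and a production version would more likely count real tokens and split on sentence or paragraph boundaries.

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4) -> list[str]:
    """Split text on whitespace into chunks of at most max_tokens estimated tokens."""
    if max_tokens < 1 or chars_per_token < 1:
        raise ValueError("max_tokens and chars_per_token must be positive")
    max_chars = max_tokens * chars_per_token
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # A single word longer than the budget is hard-split into pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(" ".join(current))
                current, current_len = [], 0
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        # +1 accounts for the joining space when the chunk is non-empty.
        added = len(word) + (1 if current else 0)
        if current_len + added > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks


print(chunk_text("aaaa bbbb cccc", max_tokens=3))  # ['aaaa bbbb', 'cccc']
```

Each chunk can then be sent as its own request, with the per-chunk results combined afterwards in whatever way suits the task.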
Code: broken vs fixed
# Broken: the prompt is sent without checking its length
import os
from groq import GroqClient
client = GroqClient(api_key=os.environ['GROQ_API_KEY'])
long_prompt = """Very long text exceeding model context length..."""
response = client.generate(model="groq-1", prompt=long_prompt)  # Raises GroqContextLengthExceededError

# Fixed: the prompt is truncated before sending
import os
from groq import GroqClient
client = GroqClient(api_key=os.environ['GROQ_API_KEY'])
long_prompt = """Very long text exceeding model context length..."""
# Truncate the prompt so that prompt plus completion fit the context window
max_context_tokens = 8192  # example limit; check your model's actual value
completion_tokens = 1024   # illustrative reserve for the completion
max_prompt_chars = (max_context_tokens - completion_tokens) * 4  # rough estimate: ~4 characters per token
truncated_prompt = long_prompt[:max_prompt_chars]
response = client.generate(model="groq-1", prompt=truncated_prompt)  # Fixed: input fits context length
print(response)
Workaround
Catch GroqContextLengthExceededError and implement fallback logic to split the input into smaller chunks, then process each chunk separately.
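One way to structure that fallback is sketched below. To keep the example runnable without the client, `ContextLengthExceeded` stands in for GroqContextLengthExceededError and `fake_send` stands in for the real request; both are hypothetical, as is the 20-character limit.

```python
from typing import Callable


class ContextLengthExceeded(Exception):
    """Stand-in for the client's context-length error (hypothetical class)."""


def process_with_fallback(text: str, send: Callable[[str], str], min_chars: int = 1) -> list[str]:
    """Send text; on a context-length error, halve it and process each half."""
    try:
        return [send(text)]
    except ContextLengthExceeded:
        if len(text) <= min_chars:
            raise  # cannot split further; surface the error
        middle = len(text) // 2
        # Prefer splitting at whitespace at or before the midpoint to avoid cutting words.
        split_at = text.rfind(" ", 0, middle + 1)
        if split_at <= 0:
            split_at = middle
        left, right = text[:split_at], text[split_at:].lstrip()
        results: list[str] = []
        for part in (left, right):
            if part:  # skip halves that were only whitespace
                results.extend(process_with_fallback(part, send, min_chars))
        return results


def fake_send(prompt: str) -> str:
    """Simulated request that rejects prompts longer than 20 characters."""
    if len(prompt) > 20:
        raise ContextLengthExceeded(f"{len(prompt)} chars exceeds limit")
    return prompt.upper()


print(process_with_fallback("alpha beta gamma delta epsilon zeta", fake_send))
# ['ALPHA BETA GAMMA', 'DELTA EPSILON ZETA']
```

Splitting only after a failure costs one wasted request per oversized input, so this suits a safety net better than a primary strategy.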
Prevention
Count tokens before every request (with Groq's tokenizer utilities, where available) and design your application to chunk or summarize inputs so that the prompt, plus room for the completion, always stays within the model's context length limit.