High severity · Intermediate · Fix: 5-15 min

SAFETY

google.generativeai.types.HarmBlockedError or finish_reason=SAFETY

What this error means

Gemini's safety filter blocked the API response because the generated content was flagged under one of Google's harm categories (hate speech, sexually explicit content, dangerous content, or harassment).

Stack trace

traceback

google.generativeai.types.HarmBlockedError: The response was blocked by the safety filter.

Response finish_reason: SAFETY
Blocked reason categories: [HarmCategory.HARM_CATEGORY_HATE_SPEECH, HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT]

At generation.py:345 in generate_content
  response = self.generate_content(
            ^

QUICK FIX

Add safety_settings to your generate_content() call, setting each HarmCategory threshold to BLOCK_ONLY_HIGH, and reword your prompt to avoid explicit requests for harmful content.

Why it happens

Google's safety classifiers score both user prompts and model-generated text against four harm categories: hate speech, dangerous content, sexually explicit content, and harassment. When the model's response contains text whose harm probability reaches the configured blocking threshold, the safety filter withholds it and returns finish_reason=SAFETY instead of the generated content. This can happen even if your input prompt was benign: the model's output itself triggered the filter. The threshold depends on the safety settings you configure; stricter settings (BLOCK_LOW_AND_ABOVE) block more content, while permissive settings (BLOCK_NONE) allow more.
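The relationship between thresholds and harm probabilities can be sketched as a small lookup. This is an illustrative model only, not the SDK's implementation: it assumes the four probability levels (NEGLIGIBLE, LOW, MEDIUM, HIGH) and the threshold names from the HarmBlockThreshold enum, and `is_blocked` is a hypothetical helper.

```python
# Illustrative model of threshold semantics; NOT the SDK's implementation.
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]

# Lowest probability level that each threshold blocks (None = never block).
THRESHOLD_FLOOR = {
    "BLOCK_LOW_AND_ABOVE": "LOW",
    "BLOCK_MEDIUM_AND_ABOVE": "MEDIUM",
    "BLOCK_ONLY_HIGH": "HIGH",
    "BLOCK_NONE": None,
}


def is_blocked(threshold: str, probability: str) -> bool:
    """Return True if content rated `probability` is blocked under `threshold`."""
    floor = THRESHOLD_FLOOR[threshold]
    if floor is None:
        return False
    return PROBABILITY_ORDER.index(probability) >= PROBABILITY_ORDER.index(floor)


# Print which probability levels each threshold would block.
for threshold in THRESHOLD_FLOOR:
    blocked = [p for p in PROBABILITY_ORDER if is_blocked(threshold, p)]
    print(f"{threshold}: blocks {blocked or 'nothing'}")
```

Reading the table top to bottom shows why a stricter threshold produces more SAFETY finishes for the same model output.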

Detection

Always check the finish_reason field on each candidate in the response object. If finish_reason is SAFETY, the content was blocked and no text is returned for that candidate. Log the prompt together with the flagged harm categories (from the candidate's safety ratings) to identify patterns: are your prompts asking for violent content, explicit instructions for harm, or other policy violations? Monitor for repeated blocks: they indicate systematic issues with your prompt design or model selection.
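A detection check along these lines might look like the sketch below. It is duck-typed rather than tied to the SDK: it assumes the response exposes `prompt_feedback.block_reason`, `candidates`, `finish_reason`, and `safety_ratings` attributes, and `describe_safety_block` is a hypothetical helper name. The `SimpleNamespace` object is a stand-in for a real response so the sketch runs without an API key.

```python
from types import SimpleNamespace


def _name(value):
    """Return an enum member's name, or the value itself as a string."""
    return getattr(value, "name", str(value))


def describe_safety_block(response):
    """Return a dict describing a safety block, or None if nothing was blocked."""
    feedback = getattr(response, "prompt_feedback", None)
    block_reason = getattr(feedback, "block_reason", None)
    if block_reason:  # the prompt itself was rejected
        return {"stage": "prompt", "reason": _name(block_reason), "ratings": {}}
    for candidate in getattr(response, "candidates", None) or []:
        if _name(candidate.finish_reason) == "SAFETY":
            ratings = {
                _name(r.category): _name(r.probability)
                for r in getattr(candidate, "safety_ratings", [])
            }
            return {"stage": "response", "reason": "SAFETY", "ratings": ratings}
    return None


# Stand-in for a blocked response (assumed attribute layout, for illustration).
fake_response = SimpleNamespace(
    prompt_feedback=SimpleNamespace(block_reason=None),
    candidates=[
        SimpleNamespace(
            finish_reason="SAFETY",
            safety_ratings=[
                SimpleNamespace(category="HARM_CATEGORY_HATE_SPEECH", probability="HIGH")
            ],
        )
    ],
)
print(describe_safety_block(fake_response))
```

Logging the returned dict alongside the prompt gives you the category-level pattern data described above.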

Causes & fixes

Prompt explicitly asks for harmful content (instructions for violence, illegal activities, explicit sexual content, or hate speech)

✓ Fix

Reframe your prompt to request the information in a safety-compliant way. Instead of 'write instructions for making explosives', ask 'explain the chemistry of combustion reactions in academic terms'. Use neutral, educational framing.

Safety settings are too strict (BLOCK_LOW_AND_ABOVE) for your use case, blocking legitimate content

✓ Fix

Relax the filter by passing safety_settings with HarmBlockThreshold.BLOCK_ONLY_HIGH or BLOCK_MEDIUM_AND_ABOVE, so that only higher-probability content is blocked. Use this for content moderation, creative writing, or discussing sensitive topics academically.

Model is generating toxic follow-up text on its own due to prompt context or topic sensitivity

✓ Fix

Add explicit safety instructions in your system prompt: 'Generate helpful, respectful content. Avoid hate speech, violence, explicit content, and harmful instructions.' Use a model with better safety alignment like gemini-2.0-flash.

Using a model that's overly sensitive or misclassifying benign content due to keyword triggers

✓ Fix

Switch to gemini-2.0-flash which has improved safety accuracy. If blocked content is truly benign, set safety_settings to BLOCK_ONLY_HIGH for the specific harm category causing issues.
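To relax only the category that is causing false positives, the settings list can be built from a default plus per-category overrides. This is a minimal sketch: `build_safety_settings` is a hypothetical helper, and it assumes your SDK version accepts string names for the category and threshold entries. If it does not, map the strings to HarmCategory and HarmBlockThreshold members before passing them on.

```python
# Category names as strings (assumption: the SDK accepts these in place of enum members).
ALL_CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)


def build_safety_settings(default="BLOCK_MEDIUM_AND_ABOVE", overrides=None):
    """Build a safety_settings list, applying `overrides` to specific categories."""
    overrides = overrides or {}
    unknown = set(overrides) - set(ALL_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown harm categories: {sorted(unknown)}")
    return [
        {"category": category, "threshold": overrides.get(category, default)}
        for category in ALL_CATEGORIES
    ]


# Relax only the category that misfires; leave the rest at the default.
settings = build_safety_settings(
    overrides={"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_ONLY_HIGH"}
)
for entry in settings:
    print(entry)
```

Keeping the other categories at the default limits how much protection you give up to fix one noisy classifier.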

Code: broken vs fixed

Broken - triggers the error

python

import os
import google.generativeai as genai

genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')

# This prompt is too explicit and triggers the safety filter
prompt = "Write detailed instructions on how to make a molotov cocktail"

try:
    response = model.generate_content(prompt)  # ← BLOCKED by safety filter
    print(response.text)
except Exception as e:
    print(f"Error: {e}")
    # Error message: 'The response was blocked by the safety filter.'
    # finish_reason='SAFETY'

Fixed - works correctly

python

import os
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')

# FIXED: Reframe prompt educationally and set appropriate safety thresholds
prompt = "Explain the historical context and chemistry of combustion reactions, focusing on their legitimate industrial and scientific applications."

# Set safety settings to allow legitimate educational content
safety_settings = [
    {
        "category": HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        "threshold": HarmBlockThreshold.BLOCK_ONLY_HIGH
    },
    {
        "category": HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        "threshold": HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    },
    {
        # The SDK's enum member for sexual content is HARM_CATEGORY_SEXUALLY_EXPLICIT
        "category": HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        "threshold": HarmBlockThreshold.BLOCK_ONLY_HIGH
    },
    {
        "category": HarmCategory.HARM_CATEGORY_HARASSMENT,
        "threshold": HarmBlockThreshold.BLOCK_ONLY_HIGH
    }
]

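response = None  # lets the except block tell whether a response was returned

try:
    response = model.generate_content(
        prompt,
        safety_settings=safety_settings  # ← ADDED: Configured thresholds
    )
    print(f"Response: {response.text}")
    print(f"Finish reason: {response.candidates[0].finish_reason}")
except Exception as e:
    print(f"Error: {e}")
    # Only inspect candidates if the call itself succeeded and returned some
    if response is not None and response.candidates:
        print(f"Finish reason: {response.candidates[0].finish_reason}")
    else:
        print("Finish reason: UNKNOWN")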

Changed the prompt to use educational framing and added explicit safety_settings with BLOCK_ONLY_HIGH thresholds for most categories, allowing legitimate content while still blocking genuinely harmful material.

⚠

Workaround

If you cannot modify your prompt or safety settings, wrap the generate_content() call in a try/except block that catches the error raised for the blocked response, then retry once with added instructions to avoid the flagged categories. Log the finish_reason of each blocked response and route repeated blocks to human review before returning an error to the user. If the retry is also blocked, return a user-friendly message such as 'Your request involves sensitive content. Please rephrase and try again.'
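The retry-then-fallback flow could be sketched as follows. The `generate` argument is a hypothetical callable that takes a prompt and either returns text or raises when the response is blocked; `generate_with_fallback` and `SAFETY_SUFFIX` are illustrative names, not part of any SDK. The broad `except Exception` is a placeholder that you would narrow to the exceptions your SDK version actually raises.

```python
import logging

logger = logging.getLogger("safety_fallback")

FALLBACK_MESSAGE = "Your request involves sensitive content. Please rephrase and try again."
# Hypothetical instruction appended on retry to steer away from flagged content.
SAFETY_SUFFIX = "\n\nRespond in a neutral, educational tone and avoid harmful detail."


def generate_with_fallback(generate, prompt, max_retries=1):
    """Call `generate(prompt)`; on a block, retry with added instructions, then fall back."""
    attempt_prompt = prompt
    for attempt in range(max_retries + 1):
        try:
            return generate(attempt_prompt)
        except Exception as exc:  # assumption: narrow this to your SDK's block errors
            logger.warning("Blocked on attempt %d: %s", attempt + 1, exc)
            attempt_prompt = prompt + SAFETY_SUFFIX
    return FALLBACK_MESSAGE


# Stand-in generator that is blocked once, then succeeds on the reformulated prompt.
_calls = []


def _flaky_generate(p):
    _calls.append(p)
    if len(_calls) == 1:
        raise ValueError("response blocked")
    return "safe answer"


print(generate_with_fallback(_flaky_generate, "Explain combustion chemistry"))
```

In real use, `generate` might wrap something like `lambda p: model.generate_content(p).text`, on the assumption that reading the text of a blocked response raises in your SDK version.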

✓

Prevention

Design prompts with safety in mind from the start: use educational framing, avoid explicit requests for harmful content, and test with gemini-2.0-flash, which has improved safety accuracy. Set safety_settings based on your use case: use BLOCK_ONLY_HIGH for content moderation and sensitive but legitimate topics, and reserve stricter thresholds such as BLOCK_LOW_AND_ABOVE for applications that need conservative filtering. Implement monitoring to track finish_reason values; if SAFETY blocks exceed 5% of requests, audit your prompt design. Consider using Anthropic's Claude for sensitive applications where you need finer-grained safety control.
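The 5% monitoring rule can be sketched with a simple in-memory counter. `FinishReasonMonitor` is a hypothetical class, and the `min_requests` guard is an added assumption so that a handful of early requests does not trigger an audit; a production setup would more likely export these counts to an existing metrics system.

```python
from collections import Counter


class FinishReasonMonitor:
    """Track finish_reason values and flag when SAFETY blocks exceed a threshold."""

    def __init__(self, threshold=0.05, min_requests=100):
        self.threshold = threshold        # 5% by default, per the guidance above
        self.min_requests = min_requests  # assumption: ignore tiny sample sizes
        self.counts = Counter()

    def record(self, finish_reason):
        # Accept enum members or plain strings.
        self.counts[getattr(finish_reason, "name", str(finish_reason))] += 1

    @property
    def total(self):
        return sum(self.counts.values())

    @property
    def safety_rate(self):
        return self.counts["SAFETY"] / self.total if self.total else 0.0

    def needs_audit(self):
        return self.total >= self.min_requests and self.safety_rate > self.threshold


# Simulated traffic: 90 normal completions and 10 safety blocks.
monitor = FinishReasonMonitor()
for _ in range(90):
    monitor.record("STOP")
for _ in range(10):
    monitor.record("SAFETY")
print(f"SAFETY rate: {monitor.safety_rate:.0%}, audit needed: {monitor.needs_audit()}")
```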

Python 3.9+ · google-generativeai >=0.3.0 · tested on 0.7.0

Verified 2026-04 · gemini-2.0-flash, gemini-1.5-pro
