High severity · Intermediate · Fix: 5-10 min

ValueError: Invalid content format

ValueError (from google.generativeai.types.content_types module)

What this error means

Gemini's generate_content() rejects list-based content when an item in the list cannot be converted to a Part, for example raw image bytes, a nested list, or an unsupported type. Validation fails on the client, before any request is sent.

Stack trace

traceback

ValueError: Invalid content format
  File "/path/to/site-packages/google/generativeai/types/content_types.py", line X, in _validate_content
    raise ValueError("Invalid content format")
ValueError: Content must be a string, Part, Content, or iterable of Part or string objects. Received: <class 'list'> with invalid structure.

QUICK FIX

Pass a flat list in which every item is a Part or a string, and wrap raw image bytes before adding them: [genai.protos.Part(inline_data=genai.protos.Blob(mime_type='image/jpeg', data=img_bytes)), 'Your text prompt here']

Why it happens

generate_content() converts every item in a content list into a Part before sending the request. Strings, PIL.Image objects, and Part objects convert cleanly; raw bytes, nested lists, and unsupported types do not, and the content validator rejects the whole input. Item order is a separate concern: Google's prompting guidance recommends placing a single image before its text prompt for better results, but a well-formed list with the text first is still valid input.

Detection

Test your multimodal prompts with a simple image+text combination before building complex chains. Log the type of each item in the content list before calling generate_content() so structural problems surface early. Use type hints so that incorrectly built content lists are flagged before runtime.
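The logging step can be sketched as a small helper. This is a minimal illustration, not part of the SDK: describe_content is a hypothetical name, and it only inspects Python types, so it runs without an API key.

```python
import logging

logger = logging.getLogger(__name__)


def describe_content(content, depth=0):
    """Return one line per item showing its position, nesting depth, and type.

    Hypothetical debugging helper; it does not call the Gemini SDK.
    """
    lines = []
    indent = "  " * depth
    for index, item in enumerate(content):
        if isinstance(item, (list, tuple)):
            # Nested containers are a common cause of rejected content.
            lines.append(f"{indent}[{index}] {type(item).__name__} (nested, {len(item)} items)")
            lines.extend(describe_content(item, depth + 1))
        elif isinstance(item, (bytes, bytearray)):
            # Raw bytes still need to be wrapped before the call.
            lines.append(f"{indent}[{index}] {type(item).__name__} ({len(item)} bytes, unwrapped)")
        elif isinstance(item, str):
            lines.append(f"{indent}[{index}] str ({len(item)} chars)")
        else:
            lines.append(f"{indent}[{index}] {type(item).__name__}")
    return lines


content = ["Describe this image", b"\xff\xd8\xff", ["stray", "nesting"]]
for line in describe_content(content):
    logger.debug(line)
```

Calling this immediately before generate_content() makes unwrapped bytes and accidental nesting visible in the debug log.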

Causes & fixes

Passing raw image bytes in a list without wrapping them in a Part object

✓ Fix

Wrap image bytes in a Part before adding them to the content list. Example: genai.protos.Part(inline_data=genai.protos.Blob(mime_type='image/jpeg', data=image_bytes)). PIL.Image objects do not need wrapping; the SDK converts them itself.

Image Part placed after the text Part that refers to it

✓ Fix

Put image Parts before the related text: [image_part, text_part] rather than [text_part, image_part]. Ordering mainly affects response quality, so if the error persists after reordering, check for the structural causes in this list.

Mixing strings and Part objects with items of other types in a single list

✓ Fix

Strings and Part objects can share a list, because Gemini auto-wraps strings: [Part(...), 'text'] is valid. Convert every other item, such as raw bytes, numbers, or custom objects, to a string or a Part before adding it.

Passing nested lists or other containers instead of a flat list of Part/string objects

✓ Fix

Flatten your content structure. Gemini expects [Part, Part, string], not [[Part, Part], {string}]. Use itertools.chain() or a list comprehension to flatten content built from nested sources.
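The flattening fix might look like the sketch below. flatten_content is a hypothetical helper name, and the placeholder strings stand in for real Part objects so the example runs without the SDK. It removes one level of nesting only.

```python
from itertools import chain


def flatten_content(items):
    """Flatten one level of list/tuple nesting; leave every other item alone.

    Strings and dicts are deliberately not treated as containers, so a prompt
    string is never split into characters.
    """
    return list(chain.from_iterable(
        item if isinstance(item, (list, tuple)) else [item]
        for item in items
    ))


# Placeholder strings stand in for image Parts in this illustration.
nested = [["image_part_1", "image_part_2"], "Compare these two images"]
flat = flatten_content(nested)
print(flat)  # ['image_part_1', 'image_part_2', 'Compare these two images']
```

Checking for list and tuple explicitly, rather than for any iterable, is what keeps strings intact.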

Code: broken vs fixed

Broken - triggers the error

python

import google.generativeai as genai
import os

genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')

# BROKEN: raw image bytes placed directly in the content list
with open('photo.jpg', 'rb') as f:
    image_bytes = f.read()

response = model.generate_content([
    'Describe this image',
    image_bytes  # Raw bytes: cannot be converted to a Part
])  # Fails here, before any request is sent

print(response.text)

Fixed - works correctly

python

import google.generativeai as genai
import os

genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')

# FIXED: image bytes wrapped in a Part; image placed before the text
with open('photo.jpg', 'rb') as f:
    image_bytes = f.read()

# genai.protos exposes the underlying Part and Blob message types
image_part = genai.protos.Part(
    inline_data=genai.protos.Blob(mime_type='image/jpeg', data=image_bytes)
)

response = model.generate_content([
    image_part,  # Image Part first
    'Describe this image'  # Text second; the SDK wraps the string in a Part
])

print(response.text)

Wrapped the image bytes in a Part and placed the image before the text, so generate_content() receives a flat list of items it can convert.
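An alternative to constructing a Part by hand is a plain dict with mime_type and data keys. The sketch below assumes the SDK accepts that dict form as an image blob. image_blob is a hypothetical helper name, and the MIME type is guessed from the file extension rather than from the file contents.

```python
import mimetypes
from pathlib import Path


def image_blob(path):
    """Build a {'mime_type', 'data'} dict from an image file on disk.

    Hypothetical helper: the dict shape is assumed to match what the SDK
    accepts as an inline image blob.
    """
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type for {path!r}")
    return {"mime_type": mime_type, "data": Path(path).read_bytes()}


# Intended use (requires a configured model and a real photo.jpg):
# content = [image_blob("photo.jpg"), "Describe this image"]
# response = model.generate_content(content)
```

Failing fast on an unknown extension avoids sending a blob with a missing or non-image MIME type.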

⚠ Workaround

If you cannot refactor to build Part objects, open the image with PIL and pass the PIL.Image directly; the SDK converts it for you. Alternatively, split the work into sequential calls: send the image with a short prompt, capture the response, then feed that text into follow-up text-only calls. The second approach keeps each content list simple at the cost of extra API calls.

✓ Prevention

Wrap image bytes with genai.protos.Part(inline_data=genai.protos.Blob(mime_type=..., data=...)) before adding them to a multimodal list. Document your content list order: images first, then text. Add a helper function that validates content structure before calling generate_content(), checking that the list is flat and that every item is a supported type. Use type hints such as content: list[Part | str] so structural mistakes surface at code review time; on Python 3.9, write list[Union[Part, str]], since the | syntax for types requires Python 3.10.
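A validation helper along these lines could run before each call. This is a sketch under assumptions: validate_content is a hypothetical name, and it covers only the structural problems described on this page (non-list input, empty lists, nesting, raw bytes). Other item types are passed through for the SDK to judge.

```python
def validate_content(content):
    """Raise an error describing the first structural problem found.

    Hypothetical pre-flight check; returns the content unchanged when it
    finds nothing wrong, so it can wrap the argument inline.
    """
    if not isinstance(content, (list, tuple)):
        raise TypeError(f"content must be a list or tuple, got {type(content).__name__}")
    if not content:
        raise ValueError("content is empty")
    for index, item in enumerate(content):
        if isinstance(item, (list, tuple, set)):
            raise ValueError(f"item {index} is a nested {type(item).__name__}; flatten the list")
        if isinstance(item, (bytes, bytearray)):
            raise ValueError(f"item {index} is raw bytes; wrap it in a Part first")
    return content


# Intended use (requires a configured model):
# response = model.generate_content(validate_content([image_part, "Describe this image"]))
```

Because the messages name the offending index, a failure points at the exact item to fix instead of at the whole list.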

Python 3.9+ · google-generativeai >=0.7.0 · tested on 0.7.2

Verified 2026-04 · gemini-2.0-flash, gemini-1.5-pro
