How to Intermediate · 3 min read

How to send images to Gemini API

Quick answer
To send images to the Gemini API, encode the image as a base64 string and include it in the messages payload under the image_url or image_base64 field depending on the API version. Use the vertexai Python SDK or direct HTTP requests with the appropriate model like gemini-2.0-flash for multimodal input.

PREREQUISITES

  • Python 3.8+
  • Google Cloud project with Vertex AI enabled
  • Google Cloud authentication setup (Application Default Credentials)
  • pip install vertexai

Setup

Install the vertexai SDK and authenticate with Google Cloud using Application Default Credentials. Set your Google Cloud project and location before calling the Gemini model.

bash
pip install vertexai

Step by step

This example shows how to send an image file to the Gemini API using the vertexai Python SDK. The image is read, base64-encoded, and sent as part of the message content for multimodal processing.

python
import base64
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize Vertex AI
vertexai.init(project="your-gcp-project", location="us-central1")

# Load Gemini model
model = GenerativeModel("gemini-2.0-flash")

# Read and encode image
with open("image.jpg", "rb") as img_file:
    img_bytes = img_file.read()
    img_b64 = base64.b64encode(img_bytes).decode("utf-8")

# Prepare multimodal message
messages = [
    {
        "content": {
            "text": "Describe this image",
            "image_base64": img_b64
        },
        "type": "image"
    }
]

# Generate response
response = model.generate_content(messages=messages)
print(response.text)
output
A detailed description of the image content printed here.

Common variations

  • Use image_url instead of image_base64 to send an image by URL.
  • Use async calls with await model.generate_content(...) in an async function.
  • Switch to other Gemini models like gemini-2.5-pro for higher accuracy.

Troubleshooting

  • If you get authentication errors, ensure your Google Cloud credentials are set with gcloud auth application-default login.
  • For large images, check API size limits and consider resizing before encoding.
  • If the model does not recognize the image field, verify you are using the latest vertexai SDK and model version.

Key Takeaways

  • Encode images as base64 strings to send them in Gemini API messages.
  • Use the official vertexai SDK with proper Google Cloud authentication.
  • You can send images either as base64 or via URLs depending on your use case.
  • Always check model documentation for supported multimodal input formats.
  • Handle authentication and image size limits to avoid common errors.
Verified 2026-04 · gemini-2.0-flash, gemini-2.5-pro
Verify ↗