How to use Gemini for document analysis

Quick answer

Send the document text, together with an instruction such as "Analyze this document", to a Gemini model (gemini-1.5-pro or gemini-2.0-flash) as the prompt of a content-generation request (generate_content in the google-genai Python SDK). The model replies with generated text, such as a summary or a list of key insights, based on the document content.

PREREQUISITES

Python 3.9+
Google account with a Gemini API key (created in Google AI Studio, for example)
Set environment variable GOOGLE_API_KEY with your API key
pip install google-genai (the Google Gen AI SDK for Python)

Setup

Install the Google Gen AI SDK for Python and make your API key available as an environment variable.

Install the SDK: pip install google-genai
Set your API key in the environment: export GOOGLE_API_KEY='your_api_key_here' (Linux/macOS) or set GOOGLE_API_KEY=your_api_key_here (Windows Command Prompt)

bash

pip install google-genai
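
A missing or empty key usually surfaces later as an authentication error, so it can help to check it up front. The following is a minimal sketch using only the standard library; the helper name `load_api_key` is hypothetical and not part of any SDK.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message."""
    # Treat an unset variable and a whitespace-only value the same way
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell before running the script."
        )
    return key
```

Calling `load_api_key()` once at startup and passing the result to the client keeps the key out of source code and makes a misconfigured environment fail early.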

Step by step

Send the document text to the gemini-1.5-pro model with the SDK's generate_content method, then read the generated summary or insights from the response's text attribute.

python

import os
from google import genai

# Initialize the client with the API key from the environment
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

document_text = """\
Artificial intelligence (AI) is transforming industries by enabling new capabilities in automation, data analysis, and decision-making. Document analysis helps extract key information efficiently.
"""

# generate_content sends one prompt and returns the model's reply
response = client.models.generate_content(
    model="gemini-1.5-pro",
    contents=f"Analyze this document:\n{document_text}",
)

print("Document analysis result:")
print(response.text)

output

Document analysis result:
The document highlights AI's impact on industries, focusing on automation, data analysis, and decision-making improvements through document analysis.
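
A bare "Analyze this document" instruction leaves the shape of the answer up to the model. The sketch below shows one way to spell out the analysis tasks in the prompt. It assumes a google-genai client object that exposes `client.models.generate_content`; the helper names `build_analysis_prompt` and `analyze_document` are hypothetical.

```python
from typing import Sequence


def build_analysis_prompt(document_text: str, tasks: Sequence[str]) -> str:
    """Combine numbered analysis tasks and the document into one prompt string."""
    task_lines = "\n".join(f"{i}. {task}" for i, task in enumerate(tasks, start=1))
    return (
        "Analyze the document below and complete these tasks:\n"
        f"{task_lines}\n\n"
        "Document:\n"
        f"{document_text.strip()}"
    )


def analyze_document(client, document_text: str, tasks: Sequence[str],
                     model: str = "gemini-1.5-pro") -> str:
    """Send the prompt via client.models.generate_content and return the reply text."""
    # Assumption: `client` is a google-genai Client (or an object with the same shape)
    response = client.models.generate_content(
        model=model,
        contents=build_analysis_prompt(document_text, tasks),
    )
    return response.text
```

For example, `analyze_document(client, document_text, ["Summarize in two sentences", "List the key topics"])` asks for both tasks in a single request.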

Common variations

You can switch to another Gemini model, such as gemini-2.0-flash, for faster responses, or stream the reply with generate_content_stream to show output as it is generated. The SDK also exposes async versions of its methods under client.aio, which is useful when an application handles many requests concurrently.

python

import asyncio
import os
from google import genai

async def analyze_document_async(text):
    client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
    # client.aio mirrors the synchronous API with awaitable methods
    response = await client.aio.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"Analyze this document:\n{text}",
    )
    print("Async analysis result:")
    print(response.text)

asyncio.run(analyze_document_async("AI is revolutionizing document processing."))

output

Async analysis result:
AI is revolutionizing document processing by enabling faster and more accurate extraction of key information.
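
Streaming is mentioned above but not shown. A minimal sketch follows, assuming a google-genai client whose `client.models.generate_content_stream` yields chunks with a `text` attribute; the helper name `stream_analysis` is hypothetical.

```python
def stream_analysis(client, document_text: str, model: str = "gemini-2.0-flash") -> str:
    """Print the analysis as it arrives and return the full text."""
    parts = []
    # Assumption: `client` is a google-genai Client (or an object with the same shape)
    stream = client.models.generate_content_stream(
        model=model,
        contents=f"Analyze this document:\n{document_text}",
    )
    for chunk in stream:
        # Some chunks may carry no text, so skip empty ones
        if chunk.text:
            print(chunk.text, end="", flush=True)
            parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)
```

Streaming does not shorten the total generation time; it lets a user start reading the first part of the analysis while the rest is still being produced.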

Troubleshooting

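If you get authentication errors, verify that the GOOGLE_API_KEY environment variable is set in the same shell or process that runs your script and that the key is valid.
For rate limit errors (HTTP 429), retry with exponential backoff.
If the analysis is cut off, raise max_output_tokens in the request's generation config, or try a more capable model such as gemini-1.5-pro.
If a request fails because the model name is not recognized, check the current Gemini API model list, since model availability changes over time.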
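
The exponential backoff advice can be sketched with the standard library alone. The helper below is hypothetical and SDK-agnostic: the caller supplies `is_retryable`, a predicate that decides which exceptions count as rate limit errors.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait with exponential backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out retries from clients that failed at the same time
            sleep(delay + random.uniform(0, delay / 2))
```

With the google-genai SDK, `is_retryable` would likely check for an API error carrying status code 429. Treat that as an assumption and confirm it against the SDK's error classes before relying on it.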


Key Takeaways

Use the gemini-1.5-pro model for detailed document analysis.
Always set your API key securely via environment variables.
Async calls let you process many documents concurrently, and streaming shows output as it is generated; neither makes an individual request finish sooner.
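
To make the point about concurrency concrete, here is a minimal sketch that analyzes several documents at once while capping the number of in-flight requests. It assumes a google-genai client exposing `client.aio.models.generate_content`; the helper name `analyze_many` and the `max_concurrency` default are hypothetical.

```python
import asyncio


async def analyze_many(client, documents, model="gemini-2.0-flash", max_concurrency=4):
    """Analyze documents concurrently; results come back in input order."""
    # The semaphore limits how many requests run at the same time
    semaphore = asyncio.Semaphore(max_concurrency)

    async def analyze_one(text):
        async with semaphore:
            # Assumption: `client` is a google-genai Client (or an object with the same shape)
            response = await client.aio.models.generate_content(
                model=model,
                contents=f"Analyze this document:\n{text}",
            )
            return response.text

    # gather preserves the order of its arguments in the returned list
    return await asyncio.gather(*(analyze_one(doc) for doc in documents))
```

Keeping `max_concurrency` small is a simple way to stay under rate limits when processing a large batch.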

Verified 2026-04 · gemini-1.5-pro, gemini-2.0-flash
