Code beginner · 3 min read

How to authenticate with the Gemini API in Python

Direct answer
Authenticate with the Gemini API in Python by storing your API key in the GEMINI_API_KEY environment variable, reading it with os.environ, and passing it to genai.Client(api_key=...) from the google-genai SDK.

Setup

Install
bash
pip install google-genai
Env vars
GEMINI_API_KEY
Imports
python
import os
from google import genai
from google.genai import types

Examples

In: Send a simple prompt "Hello, Gemini!"
Out: Gemini response: Hello, Gemini! How can I assist you today?
In: Ask Gemini to summarize a text about AI advancements
Out: Gemini response: AI advancements have accelerated rapidly, enabling new applications in healthcare, finance, and more.
In: Send an empty prompt
Out: Gemini response: Please provide a valid input to proceed.
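Client-side validation can catch the empty-prompt case before spending a network round trip; a minimal sketch (safe_prompt is a hypothetical helper of ours, not part of the SDK):

```python
def safe_prompt(text: str) -> str:
    # Hypothetical guard: reject blank input locally instead of sending
    # a request the API is likely to refuse.
    if not text.strip():
        raise ValueError("Prompt must not be empty")
    return text

# safe_prompt("Hello, Gemini!") returns the prompt unchanged;
# safe_prompt("   ") raises ValueError.
```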

Integration steps

  1. Set your Gemini API key in the environment variable GEMINI_API_KEY
  2. Import os and the SDK (from google import genai)
  3. Initialize the client with genai.Client(api_key=...) using the key from os.environ
  4. Call client.models.generate_content with the desired model and prompt
  5. Read the generated text from response.text

Full code

python
import os
from google import genai
from google.genai import types

def main():
    # Fail fast if the key is missing rather than hitting an opaque auth error.
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY environment variable not set")

    client = genai.Client(api_key=api_key)

    response = client.models.generate_content(
        model="gemini-1.5-pro",
        contents="Hello, Gemini!",
        config=types.GenerateContentConfig(max_output_tokens=100),
    )

    print("Gemini response:", response.text)

if __name__ == "__main__":
    main()
output
Gemini response: Hello, Gemini! How can I assist you today?

API trace

The model name travels in the request URL (models/gemini-1.5-pro:generateContent) and the API key in the x-goog-api-key header.

Request
json
{"contents": [{"parts": [{"text": "Hello, Gemini!"}]}], "generationConfig": {"maxOutputTokens": 100}}
Response
json
{"candidates": [{"content": {"parts": [{"text": "Hello, Gemini! How can I assist you today?"}]}}], "usageMetadata": {"totalTokenCount": 25}}
Extract: response.text (the SDK concatenates the candidate's content parts for you)
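Under the hood, every call is an HTTPS POST whose API key travels in the x-goog-api-key header. A minimal standard-library sketch of constructing (not sending) such a request, assuming the public generativelanguage.googleapis.com endpoint:

```python
import json
import os
import urllib.request

# The SDK builds an equivalent request on your behalf; the key goes in a
# header, never in the URL or the JSON body.
url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)
payload = {"contents": [{"parts": [{"text": "Hello, Gemini!"}]}]}
request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "x-goog-api-key": os.environ.get("GEMINI_API_KEY", ""),
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted here to avoid a live call.
```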

Variants

Streaming response

Use streaming for real-time token-by-token output to improve user experience on long responses.

python
import os
from google import genai

def main():
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY environment variable not set")

    client = genai.Client(api_key=api_key)

    # generate_content_stream yields partial responses as tokens arrive.
    for chunk in client.models.generate_content_stream(
        model="gemini-1.5-pro",
        contents="Stream this response.",
    ):
        print(chunk.text, end="", flush=True)

if __name__ == "__main__":
    main()

Async authentication and completion

Use async calls when integrating Gemini API in applications requiring concurrency or non-blocking IO.

python
import os
import asyncio
from google import genai

async def main():
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY environment variable not set")

    client = genai.Client(api_key=api_key)

    # The async surface lives under client.aio.
    response = await client.aio.models.generate_content(
        model="gemini-1.5-pro",
        contents="Async hello!",
    )
    print("Gemini async response:", response.text)

if __name__ == "__main__":
    asyncio.run(main())

Use alternative model gemini-2.0-flash

Use the gemini-2.0-flash model for faster, lower-latency completions at a lower cost per token; check the model documentation for its context window.

python
import os
from google import genai

def main():
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY environment variable not set")

    client = genai.Client(api_key=api_key)

    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents="Use the flash model for faster response.",
    )
    print("Gemini flash model response:", response.text)

if __name__ == "__main__":
    main()

Performance

Latency: ~700ms for a gemini-1.5-pro non-streaming completion
Cost: ~$0.003 per 500 tokens for gemini-1.5-pro
Rate limits: Tier 1: 600 requests per minute, 40,000 tokens per minute

  • Use concise prompts to reduce token usage
  • Limit max output tokens (GenerateContentConfig.max_output_tokens) to only what you need
  • Reuse context when possible to avoid repeated tokens

Approach | Latency | Cost/call | Best for
Standard completion | ~700ms | ~$0.003 | General-purpose completions
Streaming completion | First tokens arrive immediately, total ~700ms | ~$0.003 | Real-time UI updates
Async completion | ~700ms (non-blocking) | ~$0.003 | Concurrent or async apps
gemini-2.0-flash model | ~400ms | ~$0.002 | Low-latency responses

Quick tip

Always store your Gemini API key securely in environment variables and never hardcode it in your source code.

Common mistake

A common mistake is forgetting to set the GEMINI_API_KEY environment variable, causing authentication failures.
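One way to surface that mistake early is a fail-fast startup check; a minimal sketch (require_api_key is a hypothetical helper of ours, not part of the SDK):

```python
import os

def require_api_key() -> str:
    # Hypothetical helper: raise a clear, actionable error at startup
    # instead of letting the client fail later with an opaque auth error.
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; run `export GEMINI_API_KEY=...` first"
        )
    return key
```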

Verified 2026-04 · gemini-1.5-pro, gemini-2.0-flash