model.start_chat(): conversation object
Why this matters
Real applications need conversational memory: a chatbot, debugging assistant, or code review tool can't restart from scratch for every user message. <code>start_chat()</code> handles context and turn order automatically, eliminating the manual work of appending/managing message history.
Explanation
What it does: model.start_chat() returns a ChatSession object that maintains conversation history and role tracking (user/model). Each call to send_message() automatically appends your message and the model's response to an internal history list, so the next turn sees the full context.
How it works: The chat object stores a list of Content objects representing the conversation thread. When you call send_message(prompt), the SDK packages your message and the entire history into a single API request to Gemini. The model sees the conversation arc, not isolated prompts. Response messages are automatically added to history for the next turn.
When to use it: Use this for any interactive experience: user-facing chatbots, multi-step problem solving, iterative refinement workflows, or debugging conversations where context from earlier turns directly influences the next response. It's also the idiomatic way to build conversation in Gemini unlike single-call APIs where you'd manually manage message lists.
Request code
import google.generativeai as genai
import os
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
model = genai.GenerativeModel('gemini-2.0-flash')
chat = model.start_chat(history=[])
response1 = chat.send_message('Explain quantum entanglement in one sentence')
print('Assistant:', response1.text)
print('Message count:', len(chat.history))
response2 = chat.send_message('Now explain it to a five-year-old')
print('Assistant:', response2.text)
print('Message count:', len(chat.history))
response3 = chat.send_message('What was the first thing I asked you?')
print('Assistant:', response3.text) Authentication
Ensure your API key is set before instantiating the model. The SDK reads GOOGLE_API_KEY from environment or accepts it via genai.configure(api_key='your-key'). The conversation object inherits this authentication: no additional setup per chat.
Response shape
| Field | Description |
|---|---|
text | The assistant's response as a plain string |
parts | List of Content.Part objects (usually contains one text part) |
finish_reason | Enum indicating why generation stopped (STOP, MAX_TOKENS, etc.) |
usage_metadata | Object with prompt_token_count, candidates_token_count, total_token_count |
Field guide
text The primary field you'll use 99% of the time: the actual model response as a string
parts Lower-level access to response components; useful if you need to inspect token counts or check for specific content types before accessing .text
finish_reason Often overlooked but critical in production: <code>STOP</code> means normal completion, but <code>MAX_TOKENS</code> means the response was truncated: your answer is incomplete and should be handled differently
usage_metadata Developers commonly ignore this, but it's the only way to track token consumption per turn without external logging: essential for cost monitoring in high-volume systems
Setup trap
Passing a non-empty history list to start_chat() is a subtle footgun. The history must alternate perfectly: user, model, user, model. If you pass history with consecutive user messages or incorrect role tags, the API silently accepts it but treats the conversation as malformed on the next send_message(): producing errors that blame your prompt, not the history structure.
Cost
Each <code>send_message()</code> includes the <em>entire conversation history</em> in the request, not just your new message. A 10-turn conversation where each user message is 100 tokens and each response is 200 tokens will cost you tokens for all prior messages again on turn 11. For long conversations, consider summarizing history or using separate chat sessions to avoid token explosion.
Rate limits
Rapid <code>send_message()</code> calls (e.g., loop of 10 messages in 2 seconds) will hit rate limits before single-call APIs. The Gemini API allows ~10 requests per minute for free tier. Long conversations with human delays won't trigger this, but automated multi-turn workflows need exponential backoff.
Common gotcha
Modifying the chat.history list directly after send_message() feels intuitive but breaks turn order. Each new send_message() expects history in the exact structure the API maintains. If you append, remove, or reorder messages manually, the next API call may fail with a 'malformed conversation' error or produce incoherent responses because turn roles are scrambled.
Error recovery
InvalidArgument: INVALID_ARGUMENTResourceExhaustedDeadlineExceededUnauthenticatedExperienced dev note
The ChatSession object's automatic history management is a gift and a trap. Gift: you never manually append/format messages. Trap: the history is mutable and kept in memory: long-running applications leak memory if you spawn unbounded chats. In production, either bound chat lifetime (max 50 turns per chat, then restart) or externalize history to a database and rebuild the ChatSession periodically. Also, history is not persisted across process restarts: if you need conversation recovery, export chat.history to JSON before shutting down, then rebuild via model.start_chat(history=loaded_history).
Check your understanding
You're building a customer support chatbot. A user sends 5 messages over 30 minutes, then your process crashes and restarts. The next message from the user produces a response that ignores context from their first 3 messages. Why, and what's the fix?
Show answer hint
The crash destroyed the in-memory ChatSession object. History is not persisted by the API itself: you must save it before shutdown and reload it into a new ChatSession. The Gemini API has no server-side session storage like traditional chatbots.
ChatSession.send_message(). Earlier 0.1.x versions used deprecated ChatSession.send_messages() (plural): upgrade to avoid maintenance debt. The history structure and role enumeration are stable across 0.8.x.