How to use AutoGen in Python
Direct answer
Use the autogen Python library to create AI agents by defining roles and tasks, then orchestrate their interaction with built-in agent classes and LLMs.
Setup
Install
pip install autogen
Env vars
OPENAI_API_KEY
Imports
from autogen import AssistantAgent, UserProxyAgent
import os
Examples
in: Create a simple AutoGen agent that answers a user question.
out: UserProxyAgent sends the question, and AssistantAgent answers it using an OpenAI LLM.
in: Set up two agents to collaborate on a task using an AutoGen two-agent chat.
out: UserProxyAgent and AssistantAgent exchange messages until the task is complete.
in: Use AutoGen with an OpenAI API key to generate a summary from user input.
out: AssistantAgent uses an OpenAI LLM to generate a summary and returns it to UserProxyAgent.
Integration steps
- Install the autogen Python package and set your OPENAI_API_KEY environment variable.
- Import the necessary classes from autogen and build an llm_config that names the model and API key.
- Define a UserProxyAgent and an AssistantAgent with their roles and behaviors.
- Start the conversation by calling initiate_chat on the UserProxyAgent with the first message.
- Let the agents exchange messages until a turn limit or termination condition is reached.
- Read the final output from the chat result returned by initiate_chat.
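The first two steps can be sketched without calling any model. The helper below is hypothetical (`build_llm_config` is not part of AutoGen) and assumes the `config_list` dictionary layout used by AutoGen's 0.2-style agents; check the layout against the version you have installed.

```python
import os


def build_llm_config(model: str = "gpt-4o", env_var: str = "OPENAI_API_KEY") -> dict:
    """Return an llm_config-style dict, failing fast if the API key is missing.

    Hypothetical helper: the dict shape is an assumption about AutoGen's
    0.2-style configuration, not something this function verifies.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before creating agents")
    # One config_list entry per model endpoint
    return {"config_list": [{"model": model, "api_key": api_key}]}
```

Under that assumption, the returned dictionary would be passed as the `llm_config` argument when constructing an agent, so a missing key surfaces before any agent is created.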
Full code
from autogen import AssistantAgent, UserProxyAgent
import os
# LLM settings: model name plus the API key read from the environment
llm_config = {"config_list": [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]}
# Define an assistant agent that answers using the LLM
assistant = AssistantAgent(name="Assistant", llm_config=llm_config)
# Define a user proxy that sends the question without prompting a human or executing code
user = UserProxyAgent(name="User", human_input_mode="NEVER", code_execution_config=False)
# Send the question and run the conversation; max_turns=1 stops after the assistant's first reply
result = user.initiate_chat(assistant, message="What is AutoGen in Python?", max_turns=1)
# Print the assistant's response (the last message in the chat history)
print("Assistant reply:", result.chat_history[-1]["content"])
Output
Assistant reply: AutoGen is a Python library that enables you to build AI agents which collaborate and automate tasks by orchestrating large language models.
API trace
Request
{"model": "gpt-4o", "messages": [{"role": "user", "content": "What is AutoGen in Python?"}]}
Response
{"choices": [{"message": {"content": "AutoGen is a Python library that enables..."}}], "usage": {"total_tokens": 50}}
Extract
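As a minimal sketch of the extraction step, the response from this trace can be parsed as plain JSON and unpacked with dictionary access. The `extract_reply` helper is hypothetical; an SDK response object would use attribute access instead.

```python
import json

# Response body copied from the trace
raw_response = (
    '{"choices": [{"message": {"content": "AutoGen is a Python library that enables..."}}], '
    '"usage": {"total_tokens": 50}}'
)


def extract_reply(raw: str) -> tuple:
    """Return (reply text, total tokens) from a chat-completion style JSON string."""
    data = json.loads(raw)
    content = data["choices"][0]["message"]["content"]
    # usage may be absent in some responses, so fall back to 0
    total_tokens = data.get("usage", {}).get("total_tokens", 0)
    return content, total_tokens


reply, tokens = extract_reply(raw_response)
print(reply)
print(tokens)
```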
response.choices[0].message.content
Variants
Streaming Agent Conversation
Use streaming mode to get partial responses in real time for a more responsive user experience.
from autogen import AssistantAgent, UserProxyAgent
import os
# "stream": True asks the LLM client to emit tokens as they are generated
llm_config = {"config_list": [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}], "stream": True}
assistant = AssistantAgent(name="Assistant", llm_config=llm_config)
user = UserProxyAgent(name="User", human_input_mode="NEVER", code_execution_config=False)
result = user.initiate_chat(assistant, message="Explain AutoGen streaming usage.", max_turns=1)
print("Assistant reply:", result.chat_history[-1]["content"])
Async AutoGen Agent Interaction
Use async mode for concurrent agent calls or when integrating AutoGen into async applications.
import asyncio
from autogen import AssistantAgent, UserProxyAgent
import os
async def main():
    llm_config = {"config_list": [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]}
    assistant = AssistantAgent(name="Assistant", llm_config=llm_config)
    user = UserProxyAgent(name="User", human_input_mode="NEVER", code_execution_config=False)
    # a_initiate_chat is the awaitable counterpart of initiate_chat
    result = await user.a_initiate_chat(assistant, message="How to use AutoGen asynchronously?", max_turns=1)
    print("Assistant reply:", result.chat_history[-1]["content"])
asyncio.run(main())
Using Claude 3.5 Haiku Model with AutoGen
Use Anthropic Claude 3.5 Haiku as an alternative model for coding and reasoning tasks with AutoGen.
from autogen import AssistantAgent, UserProxyAgent
import os
# Assumes an AutoGen release with Anthropic support installed; "api_type" selects the Anthropic client
llm_config = {"config_list": [{"model": "claude-3-5-haiku-20241022", "api_key": os.environ["ANTHROPIC_API_KEY"], "api_type": "anthropic"}]}
assistant = AssistantAgent(name="Assistant", llm_config=llm_config)
user = UserProxyAgent(name="User", human_input_mode="NEVER", code_execution_config=False)
result = user.initiate_chat(assistant, message="What is AutoGen?", max_turns=1)
print("Assistant reply:", result.chat_history[-1]["content"])
Performance
Latency: ~800 ms per agent call with gpt-4o, non-streaming
Cost: ~$0.002 per 500 tokens exchanged
Rate limits: Tier 1, 500 requests per minute / 30,000 tokens per minute
- Keep prompts concise to reduce token usage.
- Use streaming mode to start processing partial results early.
- Cache repeated queries to avoid redundant calls.
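The caching tip can be sketched with the standard library. Here `ask_model` is a hypothetical stand-in for an expensive agent or LLM call; in-memory caching like this only suits prompts where a repeated, identical answer is acceptable.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive call actually runs


def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for an expensive agent or LLM call."""
    global call_count
    call_count += 1
    return f"answer to: {prompt}"


@lru_cache(maxsize=256)
def cached_ask(prompt: str) -> str:
    # Identical prompts are served from memory instead of calling the model again
    return ask_model(prompt)


cached_ask("What is AutoGen?")
cached_ask("What is AutoGen?")
print(call_count)  # prints 1: the second call was a cache hit
```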
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Standard chat | ~800 ms | ~$0.002 | Simple synchronous agent workflows |
| Streaming chat | ~600 ms to first output, then streaming | ~$0.002 | Real-time user interaction |
| Async chat | ~800 ms per call, run concurrently | ~$0.002 | Concurrent or high-throughput apps |
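For rough budgeting, the approximate figures in this section can be turned into simple arithmetic. The helpers below are hypothetical, and the constants are the estimates quoted above, not measured or official values.

```python
COST_PER_500_TOKENS = 0.002        # USD, approximate figure from this section
REQUESTS_PER_MINUTE_LIMIT = 500    # Tier 1 request limit from this section
TOKENS_PER_MINUTE_LIMIT = 30_000   # Tier 1 token limit from this section


def estimate_cost(total_tokens: int) -> float:
    """Estimated cost in USD for a given number of tokens exchanged."""
    return total_tokens / 500 * COST_PER_500_TOKENS


def max_calls_per_minute(tokens_per_call: int) -> int:
    """Calls per minute allowed by whichever limit (requests or tokens) binds first."""
    if tokens_per_call <= 0:
        raise ValueError("tokens_per_call must be positive")
    return min(REQUESTS_PER_MINUTE_LIMIT, TOKENS_PER_MINUTE_LIMIT // tokens_per_call)


print(round(estimate_cost(50), 6))   # 0.0002
print(max_calls_per_minute(500))     # 60
```

At 500 tokens per call the token limit binds well before the request limit, which is one reason concise prompts matter for throughput as well as cost.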
Quick tip
Always set your API keys in environment variables and build your llm_config before creating agents in AutoGen.
Common mistake
Beginners often define the agents but forget to start the conversation with initiate_chat, so the agents never exchange messages or produce output.