How to test AI agents
Quick answer
To test AI agents, use Python unit tests to verify agent logic and mock AI API calls with libraries like unittest.mock. For integration tests, call the AI APIs (e.g., OpenAI or Anthropic) with test prompts and validate responses programmatically.
PREREQUISITES
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0" (quoted so the shell does not treat > as a redirect)
Setup
Install the openai Python SDK and set your API key as an environment variable for secure authentication.
- Install SDK:
  pip install openai
- Set API key in your shell:
  export OPENAI_API_KEY='your_api_key'
pip install openai output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
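A missing or empty key usually surfaces later as an authentication error, so it can help to fail early with a clear message. The sketch below is one way to do that; `require_api_key` is a hypothetical helper, not part of the openai SDK, and the `env` parameter exists only so the check can be exercised without touching the real environment.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from env, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not provided by the openai SDK.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the tests")
    return key
```

Calling `require_api_key()` at the top of a test module that talks to the real API turns a confusing downstream failure into an immediate, readable one.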
Step by step
This example shows a simple unit test for an AI agent function that calls the OpenAI chat completion API. It mocks the API response to test agent logic without real API calls.
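import os
import unittest
from types import SimpleNamespace
from unittest.mock import patch

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


def ai_agent(prompt: str) -> str:
    """Send the prompt to the chat completions API and return the reply text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


class TestAIAgent(unittest.TestCase):
    # Patch create() on the client instance the agent actually uses.
    # A dotted path through the OpenAI class does not resolve, because
    # chat is only available on a client instance.
    @patch.object(client.chat.completions, "create")
    def test_ai_agent_response(self, mock_create):
        # Mimic the attribute shape the agent reads: choices[0].message.content
        mock_create.return_value = SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content="Hello from mock!"))]
        )
        result = ai_agent("Say hello")
        self.assertEqual(result, "Hello from mock!")
        mock_create.assert_called_once()


if __name__ == "__main__":
    unittest.main()
output
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK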
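Patching a module-level client works, but it ties the test to where that client lives. A variation on the same idea is to pass the client into the agent, so a test can hand it a stand-in object and also check what request was sent. The sketch below assumes that redesign: the `client` parameter, `make_fake_client`, and the test class name are all illustrative choices, not part of the original example or the openai SDK. It uses only `unittest.mock`, so it runs without the SDK or an API key.

```python
import unittest
from unittest.mock import MagicMock


def ai_agent(client, prompt: str, model: str = "gpt-4o-mini") -> str:
    """Same logic as the agent above, but the client is supplied by the caller."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def make_fake_client(reply: str) -> MagicMock:
    """Hypothetical helper: build a stand-in client that always answers `reply`."""
    fake = MagicMock()
    fake.chat.completions.create.return_value.choices = [
        MagicMock(message=MagicMock(content=reply))
    ]
    return fake


class TestInjectedAgent(unittest.TestCase):
    def test_reply_and_request_shape(self):
        fake = make_fake_client("Hello from mock!")
        self.assertEqual(ai_agent(fake, "Say hello"), "Hello from mock!")
        # Check what the agent sent, not only what it returned.
        fake.chat.completions.create.assert_called_once_with(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Say hello"}],
        )
```

Asserting on the call arguments catches regressions such as a changed model name or a malformed messages list, which a return-value check alone would miss.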
Common variations
You can test AI agents asynchronously using asyncio and the SDK's async client, AsyncOpenAI; the synchronous OpenAI client cannot be awaited. For integration tests, call the real API with test prompts and assert on expected response patterns rather than exact strings, since model output varies between runs. You can also use a different model, such as gpt-4o-mini or claude-3-5-sonnet-20241022, depending on your provider. Claude models are served by Anthropic's API, not OpenAI's, so the client setup differs.
import asyncio
import os

from openai import AsyncOpenAI

# AsyncOpenAI exposes awaitable versions of the client methods.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])


async def async_ai_agent(prompt: str) -> str:
    """Await the chat completions call and return the reply text."""
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


async def main():
    reply = await async_ai_agent("Hello async world")
    print(reply)


if __name__ == "__main__":
    asyncio.run(main())
output
Hello async world! (or actual model response)
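The async agent can be unit tested without network access in much the same way as the synchronous one. The sketch below uses `unittest.IsolatedAsyncioTestCase` and `unittest.mock.AsyncMock`, both in the standard library from Python 3.8. It assumes the agent takes its client as an argument, which differs from the module-level client shown above; that parameter and the test class name are illustrative choices.

```python
import unittest
from unittest.mock import AsyncMock, MagicMock


async def async_ai_agent(client, prompt: str) -> str:
    """Async agent that receives its client as an argument (an assumption of this sketch)."""
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


class TestAsyncAgent(unittest.IsolatedAsyncioTestCase):
    async def test_async_reply(self):
        # Build a response object with the shape the agent reads.
        fake_response = MagicMock()
        fake_response.choices = [
            MagicMock(message=MagicMock(content="Hello from async mock!"))
        ]
        fake_client = MagicMock()
        # AsyncMock makes create() awaitable, like the real async client method.
        fake_client.chat.completions.create = AsyncMock(return_value=fake_response)

        reply = await async_ai_agent(fake_client, "Say hello")

        self.assertEqual(reply, "Hello from async mock!")
        fake_client.chat.completions.create.assert_awaited_once()
```

Run it with `python -m unittest`; the test case class creates and tears down the event loop for each test.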
Troubleshooting
- If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
- For network timeouts, check your internet connection and retry with exponential backoff.
- If mocked tests fail, ensure the mock path matches the import path of the OpenAI client in your code.
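The exponential backoff mentioned in the list above can be sketched as a small wrapper. `retry_with_backoff` and its parameters are hypothetical names for illustration, and the default `retry_on` tuple uses built-in exceptions so the example runs anywhere. With the 1.x openai SDK you would likely pass its own error types instead, such as `openai.APITimeoutError` and `openai.APIConnectionError`.

```python
import random
import time


def retry_with_backoff(
    call,
    retries=4,
    base_delay=0.5,
    retry_on=(TimeoutError, ConnectionError),
    sleep=time.sleep,
):
    """Run call(); on a retryable error, wait base_delay * 2**attempt plus jitter, then retry.

    Hypothetical helper for illustration. `sleep` is injectable so tests
    can record the delays instead of actually waiting.
    """
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

A call site might look like `retry_with_backoff(lambda: ai_agent("Say hello"))`. The openai client also accepts a `max_retries` setting, which may be enough on its own for simple cases.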
Key Takeaways
- Use mocking to isolate AI agent logic from external API calls during unit tests.
- Run integration tests with real API calls and validate response content programmatically.
- Support async testing for agents using async SDK methods and asyncio.
- Always secure API keys via environment variables to avoid leaks.
- Match mock patch paths exactly to your client import locations to avoid silent failures.
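The integration-test takeaway can be sketched as follows. The test calls the real API, so it is skipped unless OPENAI_API_KEY is set, and it asserts on a pattern rather than an exact string because model output varies between runs. `looks_like_greeting`, the prompt, and the class name are hypothetical choices for illustration; the validator should be replaced by whatever property your agent's replies are expected to satisfy.

```python
import os
import re
import unittest


def looks_like_greeting(reply: str) -> bool:
    """Hypothetical validator: a non-empty reply that contains a greeting word."""
    if not reply.strip():
        return False
    return re.search(r"\b(hello|hi|hey)\b", reply, re.IGNORECASE) is not None


@unittest.skipUnless(os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY not set")
class TestAgentIntegration(unittest.TestCase):
    def test_greeting_prompt(self):
        # Imported here so the module still loads where the SDK is not installed.
        from openai import OpenAI

        client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Reply with a one-line greeting."}],
        )
        reply = response.choices[0].message.content
        # Pattern check, not equality: the exact wording is not deterministic.
        self.assertTrue(looks_like_greeting(reply), reply)
```

Keeping the validator as a plain function means it can be unit tested on its own, separately from the slower and billable real-API test.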