How to test AI agents

Quick answer
To test AI agents, use Python unit tests to verify agent logic and mock AI API calls with libraries like unittest.mock. For integration tests, call the AI APIs (e.g., OpenAI or Anthropic) with test prompts and validate responses programmatically.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the requirement so the shell does not treat >= as a redirect)

Setup

Install the openai Python SDK and set your API key as an environment variable for secure authentication.

  • Install SDK: pip install openai
  • Set API key in your shell: export OPENAI_API_KEY='your_api_key'
bash
pip install openai
output
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x

Step by step

This example shows a simple unit test for an AI agent function that calls the OpenAI chat completion API. It mocks the API response to test agent logic without real API calls.

python
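import os
import unittest
from types import SimpleNamespace
from unittest.mock import patch

from openai import OpenAI

# A placeholder key keeps the import from failing when the variable is unset;
# the mocked test never sends a request.
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "test-key"))


def ai_agent(prompt: str) -> str:
    """Send the prompt to the chat completions API and return the reply text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


class TestAIAgent(unittest.TestCase):
    def test_ai_agent_response(self):
        # Mimic the attribute access the agent performs: choices[0].message.content
        fake_response = SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content="Hello from mock!"))]
        )
        # Patch the method on the client instance the agent actually uses.
        with patch.object(
            client.chat.completions, "create", return_value=fake_response
        ) as mock_create:
            result = ai_agent("Say hello")
        self.assertEqual(result, "Hello from mock!")
        mock_create.assert_called_once()


if __name__ == "__main__":
    unittest.main()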
output
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
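
One variation on the mocked test above is to pass the client into the agent as an argument, so the test needs neither a patch target nor an API key. The sketch below is only an illustration: ai_agent_with_client and make_fake_client are hypothetical names that do not appear in the original example, and MagicMock stands in for the real client. Besides the return value, it also checks the request the agent builds.

```python
from unittest.mock import MagicMock


def ai_agent_with_client(client, prompt: str) -> str:
    """Hypothetical variant of ai_agent that receives its client as an argument."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def make_fake_client(reply: str) -> MagicMock:
    """Build a stand-in client whose create() call yields a canned reply."""
    fake_client = MagicMock()
    fake_response = fake_client.chat.completions.create.return_value
    # MagicMock supports indexing, so choices[0] resolves to one shared child mock.
    fake_response.choices[0].message.content = reply
    return fake_client


fake = make_fake_client("Hello from mock!")
result = ai_agent_with_client(fake, "Say hello")
assert result == "Hello from mock!"

# Verify the request the agent built, not just the value it returned.
fake.chat.completions.create.assert_called_once_with(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello"}],
)
```

Injecting the client trades a little extra plumbing for tests that cannot accidentally reach the network, which may or may not suit your agent's design.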

Common variations

You can run agents asynchronously with asyncio and the SDK's async client, as in the example below. For integration tests, call the real API with test prompts and assert on expected response patterns rather than exact strings, since model output varies between runs. Set the model name to match your provider: gpt-4o-mini for OpenAI, or claude-3-5-sonnet-20241022 for Anthropic, which has its own SDK.

python
import os
import asyncio
from openai import AsyncOpenAI

# The async client exposes awaitable methods; the sync OpenAI client does not.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def async_ai_agent(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

async def main():
    reply = await async_ai_agent("Hello async world")
    print(reply)

if __name__ == "__main__":
    asyncio.run(main())
output
(model response; the exact text varies between runs)
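
The async agent can be unit tested without network access as well. The following is a minimal sketch, assuming a hypothetical async_ai_agent_with_client function that takes its client as a parameter (the original example uses a module-level client instead). It relies on unittest.IsolatedAsyncioTestCase and unittest.mock.AsyncMock, both available from Python 3.8.

```python
import unittest
from unittest.mock import AsyncMock, MagicMock


async def async_ai_agent_with_client(client, prompt: str) -> str:
    """Hypothetical async agent that awaits the injected client's create()."""
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


class TestAsyncAgent(unittest.IsolatedAsyncioTestCase):
    async def test_async_agent_response(self):
        # Canned response object with the attributes the agent reads.
        fake_response = MagicMock()
        fake_response.choices[0].message.content = "Hello from async mock!"

        # AsyncMock makes create() awaitable, like the real async method.
        fake_client = MagicMock()
        fake_client.chat.completions.create = AsyncMock(return_value=fake_response)

        result = await async_ai_agent_with_client(fake_client, "Say hello")

        self.assertEqual(result, "Hello from async mock!")
        fake_client.chat.completions.create.assert_awaited_once()
```

Save it in a test module and run it with python -m unittest; the test case manages its own event loop, so no asyncio.run call is needed.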

Troubleshooting

  • If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
  • For network timeouts, check your internet connection and retry with exponential backoff.
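  • If mocked tests fail or still hit the network, check the patch target: it must be the object your agent code actually looks up at call time (for example, the create method on the client instance it uses), not just the place where the class is defined.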
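
The retry advice above can be sketched as a small helper. This is an illustrative implementation, not something the SDK provides under this name: retry_with_backoff and its parameters are assumptions, and the default exception types are Python built-ins, so you would likely pass the timeout and connection error classes your client library raises instead.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Run call(), doubling the wait after each retryable failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")


# Usage: a stand-in for an API call that times out twice before succeeding.
calls = {"count": 0}


def flaky_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(retry_with_backoff(flaky_call, base_delay=0.01))
```

The sleep parameter is injectable so that tests can record the delays instead of actually waiting.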

Key Takeaways

  • Use mocking to isolate AI agent logic from external API calls during unit tests.
  • Run integration tests with real API calls and validate response content programmatically.
  • Support async testing for agents using async SDK methods and asyncio.
  • Always secure API keys via environment variables to avoid leaks.
  • Match mock patch paths exactly to your client import locations to avoid silent failures.
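
The integration-test takeaway can be sketched as follows. This is a hedged example rather than a recommended suite: the prompt, the looks_like_greeting helper, and the regular expression are all hypothetical, and the test is skipped unless OPENAI_API_KEY is set, so it never runs by accident in environments without credentials. It asserts on a pattern instead of an exact string because model output varies; setting temperature to 0 reduces that variation but does not guarantee identical replies.

```python
import os
import re
import unittest


def looks_like_greeting(text: str) -> bool:
    """Hypothetical validator: accept any reply containing a common greeting."""
    return bool(re.search(r"\b(hello|hi|hey)\b", text, re.IGNORECASE))


@unittest.skipUnless(os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY not set")
class TestAgentIntegration(unittest.TestCase):
    def test_greeting_prompt(self):
        # Imported here so the module still loads where the SDK is not installed.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Greet the user in one short sentence."}],
            temperature=0,
        )
        reply = response.choices[0].message.content

        # Check shape and pattern, not exact wording.
        self.assertTrue(reply and reply.strip())
        self.assertTrue(looks_like_greeting(reply))
```

Run it with python -m unittest when a key is available; keeping these tests in a separate module from the mocked unit tests makes it easy to run them less often, since each run makes a real, billable API call.
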
Verified 2026-04 · gpt-4o-mini, claude-3-5-sonnet-20241022