
How to measure agent task completion rate

Quick answer
Measure an agent's task completion rate by defining clear success criteria for each task, logging task outcomes, and calculating the ratio of successful completions to total attempts. Use automated evaluation scripts or human verification to assess task success consistently.

PREREQUISITES

  • Python 3.8+
  • Basic knowledge of AI agents and task automation
  • Logging or tracking system for agent tasks

Set up logging and success criteria

Define what constitutes a successful task completion for your AI agent. This could be a specific output, a state change, or a verified result. Set up a logging mechanism to record each task attempt and its success or failure.

python
import logging

logging.basicConfig(level=logging.INFO, format='%(message)s')

# Example success criterion for a task (a naive keyword check, for illustration)
def is_task_successful(output):
    # Succeeds only if the output mentions "completed", case-insensitively
    return 'completed' in output.lower()
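
Recording each attempt next to its outcome can be sketched as follows. This is a minimal illustration, not a prescribed design: the `record_attempt` helper, the `agent_eval` logger name, and the JSON entry layout are all hypothetical.

```python
import json
import logging

logging.basicConfig(level=logging.INFO, format='%(message)s')
logger = logging.getLogger("agent_eval")  # hypothetical logger name

def is_task_successful(output):
    # Same naive keyword criterion as in the setup code
    return 'completed' in output.lower()

def record_attempt(task_id, output, records):
    """Append one attempt to `records` and log it as a single JSON line."""
    entry = {
        "task_id": task_id,
        "output": output,
        "success": is_task_successful(output),
    }
    records.append(entry)
    logger.info(json.dumps(entry))
    return entry

records = []
record_attempt("task-1", "Task completed successfully", records)
record_attempt("task-2", "Failed due to timeout", records)
```

Keeping the raw output in each entry lets you re-score old attempts later if the success criterion changes.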

Step-by-step measurement code

Run your agent tasks, log results, and calculate the completion rate as the ratio of successful tasks to total tasks.

python
# Simulated agent task outputs
agent_task_outputs = [
    "Task completed successfully",
    "Failed due to timeout",
    "Task completed",
    "Error: invalid input",
    "Completed with warnings"
]

# Function to check success

def is_task_successful(output):
    return 'completed' in output.lower()

# Measure completion rate
successful_tasks = sum(is_task_successful(output) for output in agent_task_outputs)
total_tasks = len(agent_task_outputs)
completion_rate = successful_tasks / total_tasks if total_tasks > 0 else 0

print(f"Task completion rate: {completion_rate:.2%} ({successful_tasks}/{total_tasks})")
output
Task completion rate: 60.00% (3/5)

Common variations

You can extend measurement by integrating human verification for ambiguous cases, using asynchronous task tracking, or employing different success criteria per task type. For large-scale systems, use databases or monitoring tools to aggregate completion metrics.

python
import asyncio
import random

def is_task_successful(output):
    # Same keyword criterion as above; note it also counts "Completed with errors"
    return 'completed' in output.lower()

async def run_agent_task_async(task_id):
    # Simulate an async task with a randomly chosen outcome
    await asyncio.sleep(0.1)
    outputs = ["Completed", "Failed", "Completed with errors"]
    return random.choice(outputs)

async def measure_async_completion_rate(num_tasks=10):
    # Run all simulated tasks concurrently and collect their outputs
    results = await asyncio.gather(*(run_agent_task_async(i) for i in range(num_tasks)))
    success_count = sum(is_task_successful(output) for output in results)
    rate = success_count / num_tasks if num_tasks > 0 else 0
    print(f"Async task completion rate: {rate:.2%} ({success_count}/{num_tasks})")

# To run:
# asyncio.run(measure_async_completion_rate())
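
Per-task-type criteria and human review of ambiguous cases could be combined along these lines. This is a sketch under stated assumptions: the task types (`summarize`, `lookup`), the checks in `CRITERIA`, and the `completion_rate_by_type` function are hypothetical placeholders, and any attempt whose type has no automated check is set aside for a human instead of being scored.

```python
from collections import defaultdict

# Hypothetical per-type success checks; replace with your own criteria.
CRITERIA = {
    "summarize": lambda out: len(out.split()) >= 3,
    "lookup": lambda out: out.strip().lower() != "not found",
}

def completion_rate_by_type(attempts):
    """Score (task_type, output) pairs per type.

    Returns (rates, needs_review), where rates maps each task type to
    (successes, total, rate) and needs_review lists the attempts that
    have no automated criterion.
    """
    counts = defaultdict(lambda: [0, 0])  # task_type -> [successes, total]
    needs_review = []
    for task_type, output in attempts:
        check = CRITERIA.get(task_type)
        if check is None:
            # No automated criterion: defer to human verification
            needs_review.append((task_type, output))
            continue
        counts[task_type][1] += 1
        if check(output):
            counts[task_type][0] += 1
    rates = {t: (s, n, s / n) for t, (s, n) in counts.items()}
    return rates, needs_review

attempts = [
    ("summarize", "A short summary here"),
    ("summarize", "n/a"),
    ("lookup", "Paris"),
    ("translate", "Bonjour"),
]
rates, needs_review = completion_rate_by_type(attempts)
```

Reporting rates per type keeps a high-volume, easy task type from hiding a low completion rate on a harder one.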

Troubleshooting task completion measurement

  • If completion rate is unexpectedly low, verify your success criteria are correctly defined and not too strict.
  • Ensure all task attempts are logged without omission.
  • For inconsistent results, add detailed logging to capture task context and errors.
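
One way to check for omitted log entries is to compare the IDs of dispatched tasks with the IDs that have a logged outcome. The sketch below assumes you can list both; the `completion_rates` function and the ID format are hypothetical. It reports the rate over logged attempts alongside a conservative rate that counts unlogged attempts as failures.

```python
def completion_rates(dispatched_ids, outcomes):
    """Compare dispatched task IDs with logged outcomes.

    outcomes maps task_id -> bool for every attempt that was logged.
    Returns (missing_ids, logged_rate, conservative_rate).
    """
    missing = sorted(set(dispatched_ids) - set(outcomes))
    successes = sum(outcomes.values())
    # Rate over logged attempts only; inflated if failures went unlogged
    logged_rate = successes / len(outcomes) if outcomes else 0.0
    # Conservative assumption: every unlogged attempt counts as a failure
    conservative_rate = successes / len(dispatched_ids) if dispatched_ids else 0.0
    return missing, logged_rate, conservative_rate

dispatched = ["t1", "t2", "t3", "t4"]
outcomes = {"t1": True, "t3": False}
missing, logged_rate, conservative_rate = completion_rates(dispatched, outcomes)
```

A large gap between the two rates suggests that attempts are failing before they reach the logging step.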

Key Takeaways

  • Define explicit success criteria for each agent task to measure completion accurately.
  • Log every task attempt and outcome to calculate reliable completion rates.
  • Use automated scripts or human review to validate task success consistently.
  • Async and large-scale task tracking require scalable logging and aggregation.
  • Refine success criteria and logging if completion rates seem inaccurate.
Verified 2026-04