Human in the loop AI workflows explained
Quick answer
Human in the loop (HITL) AI workflows combine automated AI processing with human review to improve accuracy and reliability. Use an AI API such as OpenAI's or Anthropic's to generate outputs, then route uncertain or critical cases to human reviewers for validation or correction.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- `pip install "openai>=1.0"`
Setup
Install the openai Python SDK and set your API key as an environment variable.
- Run `pip install openai`
- Set `OPENAI_API_KEY` in your environment

Output of `pip install openai`:

```text
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
```
Step by step
This example shows a simple HITL workflow where AI generates a response and uncertain outputs are flagged for human review.
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Function to get AI response
def ai_generate(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Function to simulate human review
def human_review(ai_output: str) -> str:
    print("AI output for review:", ai_output)
    # In a real workflow, this would be a review UI or human input form
    correction = input("Enter correction or press Enter to accept: ")
    return correction if correction else ai_output

# Main HITL workflow
prompt = "Summarize the key benefits of human in the loop AI workflows."
ai_output = ai_generate(prompt)

# Simple uncertainty check: flag if output contains hedge words
if any(word in ai_output.lower() for word in ["maybe", "uncertain", "possibly"]):
    final_output = human_review(ai_output)
else:
    final_output = ai_output

print("Final output:", final_output)
```

Output:

```text
AI output for review: Human in the loop AI workflows maybe improve accuracy by combining AI and human judgment.
Enter correction or press Enter to accept: They improve accuracy by combining AI with human judgment.
Final output: They improve accuracy by combining AI with human judgment.
```
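Blocking on `input()` does not scale beyond a demo. One common pattern is to collect flagged outputs in a queue so reviewers can process them in batches. Here is a minimal sketch of that routing step; the `ReviewQueue` class and `HEDGE_WORDS` tuple are illustrative names, not part of any SDK:

```python
from dataclasses import dataclass, field
from typing import List, Optional

HEDGE_WORDS = ("maybe", "uncertain", "possibly")

@dataclass
class ReviewQueue:
    """Collects flagged AI outputs so humans can review them in batches."""
    pending: List[str] = field(default_factory=list)

    def route(self, ai_output: str) -> Optional[str]:
        # Hold outputs containing hedge words for human review;
        # pass everything else through automatically.
        if any(word in ai_output.lower() for word in HEDGE_WORDS):
            self.pending.append(ai_output)
            return None
        return ai_output

queue = ReviewQueue()
accepted = queue.route("HITL workflows improve accuracy.")
held = queue.route("This maybe improves accuracy.")
print("accepted:", accepted)
print("held for review:", queue.pending)
```

The caller treats a `None` return as "awaiting review", which keeps the automated path non-blocking while humans work through `queue.pending` on their own schedule.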
Common variations
You can implement HITL workflows asynchronously or with streaming AI responses. Different models, such as gpt-4o-mini (OpenAI) or claude-3-5-haiku-20241022 (Anthropic), can be used depending on cost and latency needs.
For example, use async calls to handle multiple requests concurrently or stream partial AI outputs for faster human review.
```python
import asyncio
import os
from openai import AsyncOpenAI

# AsyncOpenAI is required for `async for` over the stream;
# the synchronous OpenAI client returns a regular iterator.
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def ai_generate_async(prompt: str) -> str:
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    result = ""
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        result += delta
    print()
    return result

async def main():
    prompt = "Explain human in the loop AI workflows briefly."
    output = await ai_generate_async(prompt)
    # Here you could add human review logic

asyncio.run(main())
```

Output:

```text
Human in the loop AI workflows combine automated AI with human oversight to improve accuracy and reliability.
```
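The "multiple requests concurrently" idea mentioned above can be sketched with `asyncio.gather`. To keep the example self-contained and runnable without an API key, `ai_generate_stub` stands in for a real call (e.g. via `AsyncOpenAI`); only the names below it are illustrative:

```python
import asyncio

async def ai_generate_stub(prompt: str) -> str:
    # Stand-in for a real API call; the sleep simulates network latency.
    await asyncio.sleep(0.01)
    return f"summary of: {prompt}"

async def generate_batch(prompts):
    # asyncio.gather runs the coroutines concurrently and returns
    # results in the same order as the input prompts.
    return await asyncio.gather(*(ai_generate_stub(p) for p in prompts))

results = asyncio.run(generate_batch(["benefits of HITL", "risks of HITL"]))
print(results)
```

Because `gather` preserves input order, you can zip results back to their prompts before deciding which ones to send for human review.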
Troubleshooting
- If you see authentication errors, verify your `OPENAI_API_KEY` environment variable is set correctly.
- If AI outputs are always accepted without review, check your uncertainty detection logic.
- For streaming issues, ensure your environment supports asynchronous iteration and you use the latest `openai` SDK.
Key takeaways
- Use AI APIs to generate outputs and flag uncertain results for human review in HITL workflows.
- Implement simple heuristics or confidence thresholds to decide when human intervention is needed.
- Leverage async and streaming features for scalable and responsive HITL systems.
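The confidence-threshold idea in the second takeaway can be sketched without an API call. With the OpenAI API, you can request token log probabilities by passing `logprobs=True` and reading `response.choices[0].logprobs.content`; the function below operates on a plain list of such log probabilities. The 0.8 threshold is an arbitrary illustration, not a recommended value:

```python
import math

def needs_review(token_logprobs, threshold=0.8):
    # Average per-token probability as a crude confidence score;
    # route to a human when the model is collectively unsure.
    avg_prob = sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)
    return avg_prob < threshold

confident = needs_review([-0.01, -0.02, -0.05])   # high-probability tokens
unsure = needs_review([-1.2, -0.9, -2.0])         # low-probability tokens
print("confident output needs review:", confident)
print("unsure output needs review:", unsure)
```

In practice you would calibrate the threshold on a labeled sample of outputs, trading human-review load against error rate.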