Computer use for data entry automation
Quick answer
Use the Anthropic API with the computer-use-2024-10-22 beta feature to automate data entry tasks: the model requests actions on a computer interface (mouse moves, clicks, keystrokes) and your Python code carries them out. This enables programmatic control for tasks like form filling and data extraction.
PREREQUISITES
- Python 3.8+
- Anthropic API key with access to the computer use beta
- pip install anthropic
Setup
Install the anthropic Python package and set your Anthropic API key as the ANTHROPIC_API_KEY environment variable. Ensure your account has access to the computer-use-2024-10-22 beta feature.
pip install anthropic
output
Collecting anthropic
Downloading anthropic-x.x.x-py3-none-any.whl
Installing collected packages: anthropic
Successfully installed anthropic-x.x.x
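Before making any API call, it can help to fail early with a readable message when the key is missing. The sketch below is one way to do that; `require_api_key` is a hypothetical helper (not part of any SDK), and the default variable name is an assumption you can override to match your setup.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None,
                    name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key stored under `name`, or raise a clear error.

    Hypothetical helper for illustration; `name` is an assumed default.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()  # treat whitespace-only values as missing
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

Calling `require_api_key()` at the top of a script turns a late authentication failure into an immediate, descriptive error.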
Step by step
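This example demonstrates automating a simple data entry task by instructing the AI to enter data into a spreadsheet using the computer tool (type computer_20241022), which the computer-use-2024-10-22 beta enables. The AI receives a prompt and returns actions to perform. The API does not control your machine itself: your code must execute each returned action and send the result back.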
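import os
import anthropic

# Reads the key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Computer use is a beta feature, so the call goes through client.beta.messages.
response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[{
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768,
    }],
    messages=[
        {"role": "user", "content": "Automate entering the following data into the spreadsheet: Name: John Doe, Age: 30, Email: john@example.com"}
    ],
)

# The reply is a list of content blocks; tool_use blocks carry the requested action.
for block in response.content:
    if block.type == "tool_use":
        print("Requested action:", block.input)
    elif block.type == "text":
        print("AI response:", block.text)
output
AI response: [The AI describes its plan and returns the computer actions needed to enter the data into the spreadsheet application, one tool_use block at a time]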
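The model only proposes actions; something on your side has to carry them out and report back. The sketch below shows one possible shape for that glue code. The handlers are hypothetical stand-ins that record what they would do instead of driving a real GUI, and the field names (`action`, `coordinate`, `text`) are assumptions modeled on the computer tool's input. `tool_result_message` builds the follow-up user message that returns a result for a given tool use ID.

```python
from typing import Any, Callable, Dict, List

# Record of what the stand-in executor "did"; a real one would drive the GUI.
performed: List[str] = []


def do_mouse_move(action: Dict[str, Any]) -> str:
    x, y = action["coordinate"]
    performed.append(f"move to ({x}, {y})")
    return "moved"


def do_left_click(action: Dict[str, Any]) -> str:
    performed.append("left click")
    return "clicked"


def do_type(action: Dict[str, Any]) -> str:
    performed.append(f"type {action['text']!r}")
    return "typed"


# Map action names to handlers (only a small, assumed subset is covered).
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "mouse_move": do_mouse_move,
    "left_click": do_left_click,
    "type": do_type,
}


def run_action(action: Dict[str, Any]) -> str:
    """Dispatch one requested action and return a short result string."""
    name = action.get("action")
    handler = HANDLERS.get(name)
    if handler is None:
        return f"unsupported action: {name}"
    return handler(action)


def tool_result_message(tool_use_id: str, result_text: str) -> Dict[str, Any]:
    """Build the user message that reports an action's result to the model."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result_text,
        }],
    }
```

In a full loop you would call `run_action` on each requested action, append the message from `tool_result_message` to the conversation, and call the API again until the model stops requesting actions.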
Common variations
You can make asynchronous calls with the SDK's AsyncAnthropic client, or drop the computer tool and beta flag and switch to a different model if you only want text output. For streaming responses, set stream=True in the request. Adjust display_width_px and display_height_px to match your screen resolution.
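import asyncio
import os
from anthropic import AsyncAnthropic

async def main():
    # The async client is required for awaitable calls.
    client = AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    stream = await client.beta.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        betas=["computer-use-2024-10-22"],
        tools=[{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 720,
        }],
        messages=[{"role": "user", "content": "Fill out the form with Name: Alice, Age: 25."}],
        stream=True,
    )
    async for event in stream:
        # Text arrives as text_delta; tool input arrives as input_json_delta fragments.
        if event.type == "content_block_delta":
            if event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "input_json_delta":
                print(event.delta.partial_json, end="", flush=True)

asyncio.run(main())
output
[Streaming AI response with incremental computer actions]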
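When a reply is streamed, the input for a requested action generally arrives as JSON text fragments rather than as one complete object, so it has to be reassembled before you can act on it. The following is a minimal sketch of that step; `assemble_tool_input` is a hypothetical helper, and it assumes you have already collected the fragments belonging to a single action in arrival order.

```python
import json
from typing import Any, Dict, Iterable


def assemble_tool_input(fragments: Iterable[str]) -> Dict[str, Any]:
    """Join streamed JSON fragments and parse them once the block is complete.

    Hypothetical helper: parsing is deferred until all fragments are present,
    because a partial fragment is usually not valid JSON on its own.
    """
    raw = "".join(fragments)
    return json.loads(raw) if raw else {}
```

For example, the fragments `'{"action": "ty'`, `'pe", "text": "Al'`, and `'ice"}'` combine into a single dictionary describing a typing action.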
Troubleshooting
- If you get an authentication error, verify your ANTHROPIC_API_KEY environment variable is set correctly.
- If the AI does not respond with computer actions, ensure you included betas=["computer-use-2024-10-22"] and the tools parameter with the correct computer tool type.
- For display issues, adjust display_width_px and display_height_px to match your actual screen resolution.
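If the display size you declare differs from the real screen (for instance because you downscale screenshots before sending them), coordinates in the returned actions need to be mapped back to real pixels. The helper below is a minimal sketch of that linear mapping; `scale_point` is a hypothetical name, and it assumes both sizes share the same origin in the top-left corner.

```python
from typing import Tuple


def scale_point(point: Tuple[int, int],
                declared: Tuple[int, int],
                actual: Tuple[int, int]) -> Tuple[int, int]:
    """Map a point from the declared display size to the actual screen size."""
    x, y = point
    declared_w, declared_h = declared
    actual_w, actual_h = actual
    if declared_w <= 0 or declared_h <= 0:
        raise ValueError("declared display size must be positive")
    # Scale each axis independently and round to the nearest pixel.
    return (round(x * actual_w / declared_w), round(y * actual_h / declared_h))
```

With a declared size of 1280x720 and a real screen of 1920x1080, `scale_point((640, 360), (1280, 720), (1920, 1080))` maps the center of one to the center of the other.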
Key Takeaways
- Use the Anthropic computer-use-2024-10-22 beta to automate data entry via AI-driven computer control.
- Always include the betas and tools parameters to enable computer use features in your API calls.
- Adjust screen resolution parameters to match your environment for accurate automation.
- Async and streaming calls provide flexible integration options for real-time automation feedback.