
Computer use for data entry automation

Quick answer
Use the Anthropic API with the computer-use-2024-10-22 beta to automate data entry tasks. The model looks at screenshots of the interface and replies with actions (mouse moves, clicks, keystrokes) that your own Python code then executes, which makes tasks like form filling and data extraction scriptable.

PREREQUISITES

  • Python 3.8+
  • Anthropic API key with access to the computer use beta
  • pip install anthropic

Setup

Install the anthropic Python package and set your Anthropic API key as the ANTHROPIC_API_KEY environment variable. Make sure your account has access to the computer-use-2024-10-22 beta.

bash
pip install anthropic
output
Collecting anthropic
  Downloading anthropic-x.y.z-py3-none-any.whl
Installing collected packages: anthropic
Successfully installed anthropic-x.y.z
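
Before making any API call, it can help to fail early when the key is missing. The following is a minimal sketch; the helper name require_api_key is hypothetical, and the variable name ANTHROPIC_API_KEY is assumed (pass a different name if your setup uses one).

```python
import os

def require_api_key(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return the API key from env, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of any SDK.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

Calling require_api_key() at the top of a script turns a confusing authentication failure into an explicit message about the missing variable.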

Step by step

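This example starts a simple data entry task by sending a prompt together with the computer_20241022 tool, enabled through the computer-use-2024-10-22 beta flag. The model does not control your machine directly: it replies with tool_use blocks that describe actions, and your code is responsible for executing them and sending back the results.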

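python
import os
from anthropic import Anthropic

# The computer use beta belongs to the Anthropic API, so use the Anthropic client.
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.beta.messages.create(
    # Assumption: a model that supports this beta; confirm which models your account can use.
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],
    tools=[{
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768
    }],
    messages=[
        {"role": "user", "content": "Enter the following data into the spreadsheet: Name: John Doe, Age: 30, Email: john@example.com"}
    ]
)

# The reply is a list of content blocks; tool_use blocks carry the requested action.
for block in response.content:
    if block.type == "tool_use":
        print("Requested action:", block.input)
    elif block.type == "text":
        print("AI response:", block.text)
output
AI response: [Optional text from the model describing what it intends to do]
Requested action: [The first computer action the model wants your code to perform]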
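
A single response is only the first step: each requested action has to be carried out by your code, and the outcome (usually a screenshot) sent back as a tool_result message so the model can choose the next action. The sketch below shows those two pieces under stated assumptions: the action input is a dict with an "action" field plus optional "text" or "coordinate" fields, as in Anthropic's tool-use format, and describe_action and build_tool_result are hypothetical helper names. Actually performing the action and capturing the screen is left to whatever automation library you use.

```python
import base64

def describe_action(tool_input):
    """Turn a computer-tool input dict into a human-readable summary (for logging)."""
    action = tool_input.get("action")
    if action == "type":
        return f"type text {tool_input.get('text', '')!r}"
    if action == "key":
        return f"press key {tool_input.get('text', '')!r}"
    if action in ("mouse_move", "left_click_drag"):
        x, y = tool_input["coordinate"]
        return f"{action} to ({x}, {y})"
    # Actions such as "screenshot" or "left_click" carry no extra fields here.
    return str(action)

def build_tool_result(tool_use_id, png_bytes):
    """Build the user message that returns a PNG screenshot for a tool_use block."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": [{
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": base64.b64encode(png_bytes).decode("ascii"),
                },
            }],
        }],
    }
```

In a full loop, you would append the assistant's reply and the message from build_tool_result to the messages list and call the API again, repeating until the reply contains no tool_use blocks.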

Common variations

You can make asynchronous calls with the SDK's async client, and set stream=True in the request to receive the response incrementally. For text-only automation that needs no screen control, omit the computer tool and the beta flag and make a regular messages call instead. Adjust display_width_px and display_height_px to match the resolution of the screenshots you send.

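python
import asyncio
import os
from anthropic import AsyncAnthropic

async def main():
    # await requires the async client; the synchronous client cannot be awaited.
    client = AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    stream = await client.beta.messages.create(
        # Assumption: a model that supports this beta; confirm which models your account can use.
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        betas=["computer-use-2024-10-22"],
        tools=[{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 720
        }],
        messages=[{"role": "user", "content": "Fill out the form with Name: Alice, Age: 25."}],
        stream=True
    )
    async for event in stream:
        # Text arrives as text_delta events; tool input arrives as partial JSON.
        if event.type == "content_block_delta":
            if event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "input_json_delta":
                print(event.delta.partial_json, end="", flush=True)

asyncio.run(main())
output
[Streaming AI response with incremental computer actions]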

Troubleshooting

  • If you get an authentication error, verify that your ANTHROPIC_API_KEY environment variable is set correctly.
  • If the AI does not respond with computer actions, make sure you are calling client.beta.messages.create with betas=["computer-use-2024-10-22"] and a tools parameter that uses the computer_20241022 tool type.
  • For display issues, adjust display_width_px and display_height_px to match your actual screen resolution.
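
When the size declared in display_width_px and display_height_px differs from the real screen (for example, because screenshots are downscaled before being sent), coordinates returned by the model need to be mapped back before clicking. A minimal sketch, assuming simple proportional scaling; scale_to_screen is a hypothetical helper, not part of any SDK.

```python
def scale_to_screen(x, y, api_size, screen_size):
    """Map a coordinate from the size declared to the API onto the real screen.

    api_size and screen_size are (width, height) tuples in pixels.
    """
    api_w, api_h = api_size
    screen_w, screen_h = screen_size
    return round(x * screen_w / api_w), round(y * screen_h / api_h)

# Example: the API was told 1024x768, but the real screen is 2048x1536.
print(scale_to_screen(512, 384, (1024, 768), (2048, 1536)))  # → (1024, 768)
```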

Key Takeaways

  • Use the Anthropic computer-use-2024-10-22 beta to automate data entry via AI-driven computer control.
  • Always include the betas and tools parameters to enable computer use features in your API calls.
  • Adjust screen resolution parameters to match your environment for accurate automation.
  • Async and streaming calls provide flexible integration options for real-time automation feedback.
Verified 2026-04 · claude-3-5-haiku-20241022, gpt-4o-mini