Computer use for data entry automation

Quick answer

Use the Anthropic API with the computer-use-2024-10-22 beta to automate data entry tasks. The model inspects the screen and returns computer actions (move, click, type) that your Python code executes, which enables programmatic form filling and data extraction.

PREREQUISITES

Python 3.8+
Anthropic API key with access to the computer use beta
pip install anthropic

Setup

Install the anthropic Python package and set your Anthropic API key as the ANTHROPIC_API_KEY environment variable. Ensure your account has access to the computer-use-2024-10-22 beta.

bash

pip install anthropic

output

Collecting anthropic
  Downloading anthropic-x.x.x-py3-none-any.whl
Installing collected packages: anthropic
Successfully installed anthropic-x.x.x
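Before making any API call it can help to fail early when the key is missing. The helper below is a minimal sketch; `require_env` is a hypothetical name, not part of any SDK, and it assumes the key is stored in ANTHROPIC_API_KEY because the computer-use beta is served by Anthropic's API.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or fail with a clear message.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return value
```

Calling `require_env("ANTHROPIC_API_KEY")` at the top of a script turns a confusing authentication error later on into an immediate, readable one.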

Step by step

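This example automates a simple data entry task by asking the model to fill in a spreadsheet using the computer_20241022 tool, which is enabled through the computer-use-2024-10-22 beta. The model receives a prompt and returns the actions to perform. It does not control the machine itself, so your code must execute each action and send back the result.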

python

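import os
import anthropic

# The computer-use beta is served by the Anthropic Messages API, not the OpenAI SDK.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.beta.messages.create(
    # The computer-use-2024-10-22 beta launched with this Claude 3.5 Sonnet release.
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,  # required by the Messages API
    betas=["computer-use-2024-10-22"],
    tools=[{
        "type": "computer_20241022",
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768,
    }],
    messages=[
        {"role": "user", "content": "Enter the following data into the spreadsheet: Name: John Doe, Age: 30, Email: john@example.com"}
    ],
)

# The reply is a list of content blocks; tool_use blocks carry the requested action.
for block in response.content:
    if block.type == "text":
        print("Text:", block.text)
    elif block.type == "tool_use":
        print("Action:", block.input)

output

Text: [The model's description of its next step]
Action: [A computer action to perform, such as taking a screenshot, moving the pointer, or typing text]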
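The model only proposes actions; your own code has to carry them out and report the outcome in a follow-up message. The sketch below shows one way to dispatch a few of the action types used by the computer_20241022 tool (`screenshot`, `mouse_move`, `left_click`, `type`, `key`) and to wrap the outcome in a `tool_result` message. `RecordingBackend`, `execute_action`, and `tool_result_message` are hypothetical names for illustration, and the backend only records calls instead of driving a real mouse and keyboard.

```python
from typing import Any, Dict, List, Tuple


class RecordingBackend:
    """Hypothetical stand-in for a real input driver; it only records calls."""

    def __init__(self) -> None:
        self.log: List[Tuple[Any, ...]] = []

    def move(self, x: int, y: int) -> None:
        self.log.append(("move", x, y))

    def click(self, button: str) -> None:
        self.log.append(("click", button))

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))

    def press(self, keys: str) -> None:
        self.log.append(("press", keys))

    def screenshot(self) -> None:
        self.log.append(("screenshot",))


# Click actions take no coordinate: they act at the current pointer position.
CLICK_BUTTONS = {"left_click": "left", "right_click": "right", "middle_click": "middle"}


def execute_action(backend: RecordingBackend, action_input: Dict[str, Any]) -> str:
    """Run one action from a tool_use block and return a short status string."""
    action = action_input["action"]
    if action == "screenshot":
        backend.screenshot()
        return "screenshot taken"
    if action == "mouse_move":
        x, y = action_input["coordinate"]
        backend.move(x, y)
        return f"moved to ({x}, {y})"
    if action in CLICK_BUTTONS:
        backend.click(CLICK_BUTTONS[action])
        return f"{CLICK_BUTTONS[action]} click"
    if action == "type":
        backend.type_text(action_input["text"])
        return "typed text"
    if action == "key":
        backend.press(action_input["text"])
        return "pressed keys"
    raise ValueError(f"unsupported action: {action!r}")


def tool_result_message(tool_use_id: str, text: str) -> Dict[str, Any]:
    """Build the user message that reports an action's outcome back to the model."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": [{"type": "text", "text": text}],
        }],
    }
```

In a real loop you would append the assistant reply and the `tool_result` message to `messages`, call the API again, and repeat until the reply contains no more `tool_use` blocks. Swapping `RecordingBackend` for a real driver is left open here on purpose.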

Common variations

You can make asynchronous calls with the AsyncAnthropic client. For streaming responses, set stream=True in the request and iterate over the returned events. If you only need text-only automation, omit the betas and tools parameters and make a regular messages call. Adjust display_width_px and display_height_px to match your screen resolution.

python

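import asyncio
import os
from anthropic import AsyncAnthropic

async def main():
    # The async client is required for await / async for to work.
    client = AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    stream = await client.beta.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        betas=["computer-use-2024-10-22"],
        tools=[{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 720,
        }],
        messages=[{"role": "user", "content": "Fill out the form with Name: Alice, Age: 25."}],
        stream=True,
    )
    async for event in stream:
        if event.type != "content_block_delta":
            continue
        if event.delta.type == "text_delta":
            # Incremental text from the model
            print(event.delta.text, end="", flush=True)
        elif event.delta.type == "input_json_delta":
            # Incremental JSON for the tool_use action being built
            print(event.delta.partial_json, end="", flush=True)

asyncio.run(main())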

output

[Streaming AI response with incremental computer actions]
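Since every variation repeats the same betas and tools parameters and differs mainly in the prompt and display size, the request can be assembled by a small helper. This is a sketch under assumptions: `format_record_prompt` and `build_computer_use_request` are hypothetical names, and the returned dictionary simply mirrors the keyword arguments used in the examples on this page.

```python
from typing import Any, Dict, Mapping


def format_record_prompt(record: Mapping[str, Any], target: str = "form") -> str:
    """Turn one data record into a data entry instruction for the model."""
    fields = ", ".join(f"{key}: {value}" for key, value in record.items())
    return f"Enter the following data into the {target}: {fields}"


def build_computer_use_request(
    prompt: str,
    width: int,
    height: int,
    model: str,
    max_tokens: int = 1024,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a computer-use request."""
    if width <= 0 or height <= 0:
        raise ValueError("display dimensions must be positive")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "betas": ["computer-use-2024-10-22"],
        "tools": [{
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": width,
            "display_height_px": height,
        }],
        "messages": [{"role": "user", "content": prompt}],
    }
```

The result can be unpacked into the call, for example `client.beta.messages.create(**request)`, so the sync and async variants share one definition of the tool and beta flag.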

Troubleshooting

If you get an authentication error, verify your ANTHROPIC_API_KEY environment variable is set correctly.
If the model does not respond with computer actions, ensure you called client.beta.messages.create with betas=["computer-use-2024-10-22"] and the tools parameter with the correct computer tool type.
For display issues, adjust display_width_px and display_height_px to match your actual screen resolution.
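If the declared display size cannot match the real screen (for example, because screenshots are downscaled before being sent), coordinates returned by the model need to be mapped back to real pixels. The function below is a minimal sketch of that linear mapping; `scale_to_screen` is a hypothetical helper, and it assumes the screenshot covers the whole screen with the same aspect handling on both axes.

```python
from typing import Tuple


def scale_to_screen(
    x: int,
    y: int,
    declared: Tuple[int, int],
    actual: Tuple[int, int],
) -> Tuple[int, int]:
    """Map a point from the declared display size to the real screen size.

    declared: (display_width_px, display_height_px) sent in the tool definition.
    actual:   (width, height) of the physical screen in pixels.
    """
    declared_w, declared_h = declared
    actual_w, actual_h = actual
    if declared_w <= 0 or declared_h <= 0:
        raise ValueError("declared dimensions must be positive")
    return round(x * actual_w / declared_w), round(y * actual_h / declared_h)
```

For instance, a click the model places at the centre of a 1024x768 display maps to the centre of a 2048x1536 screen.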

Key Takeaways

Use the Anthropic computer-use-2024-10-22 beta to automate data entry via AI-driven computer control.
Always include the betas and tools parameters to enable computer use features in your API calls.
Adjust screen resolution parameters to match your environment for accurate automation.
Async and streaming calls provide flexible integration options for real-time automation feedback.

Verified 2026-04 · claude-3-5-haiku-20241022, gpt-4o-mini
