How to use Browser Use with OpenAI
Quick answer
Use the browser-use package with an OpenAI API key to automate browser tasks by combining Agent from browser_use with ChatOpenAI from langchain_openai. Instantiate Agent with a task and an LLM instance, then run it asynchronously to get the result.
Prerequisites
- Python 3.11+ (required by current browser-use releases)
- OpenAI API key (free tier works)
- pip install "openai>=1.0" (quoted so the shell does not treat >= as a redirect)
- pip install browser-use langchain_openai
- Playwright installed with Chromium (run: playwright install chromium)
Setup
Install the required packages and set your OpenAI API key as an environment variable. Also, install Playwright's Chromium browser for automation.
pip install openai browser-use langchain_openai
playwright install chromium
Output:
Collecting openai
Collecting browser-use
Collecting langchain_openai
Installing collected packages...
Playwright browsers installed successfully.
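Before writing the agent script, it can help to confirm that the packages from the install step are importable. The following is a minimal sketch using only the standard library. The helper name missing_modules and the REQUIRED_MODULES list are illustrative and not part of browser-use. The check only covers the Python packages, not whether the Chromium binary has been downloaded.

```python
import importlib.util

# Import names (not pip names) of the packages this guide relies on.
REQUIRED_MODULES = ["browser_use", "langchain_openai", "playwright"]


def missing_modules(names):
    """Return the subset of `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported as missing, rerun the pip install command above inside the same Python environment you use to run the script.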
Step by step
This example shows how to create a simple browser automation task that searches Google for 'AI news' using Agent from browser_use and ChatOpenAI from langchain_openai. The code runs asynchronously and prints the result.
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI
import os
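async def main():
    # Read the API key from the environment rather than hardcoding it.
    llm = ChatOpenAI(model="gpt-4o-mini", openai_api_key=os.environ["OPENAI_API_KEY"])
    # The agent drives the browser to carry out the natural-language task.
    agent = Agent(task="Go to google.com and search for 'AI news'", llm=llm)
    # run() is a coroutine, so it must be awaited.
    result = await agent.run()
    print("Result:", result)

if __name__ == "__main__":
    asyncio.run(main())
Output: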
Result: [Agent's summarized output of the search or browser interaction]
Common variations
- Use a different LLM by changing the model argument of ChatOpenAI, e.g., from gpt-4o-mini to gpt-4o.
- Run the agent from synchronous code by wrapping the async call in asyncio.run(), or await it directly inside an async framework.
- Customize browser options or tasks by passing additional parameters to Agent.
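Wrapping the async call for synchronous code can be sketched as a small helper. This is an illustration under assumptions, not part of browser-use: run_agent_sync and its timeout_seconds parameter are hypothetical names, and FakeAgent is a stand-in with the same async run() method so the sketch runs without a browser or API key.

```python
import asyncio


def run_agent_sync(agent, timeout_seconds=300):
    """Run agent.run() to completion from synchronous code, with a time limit."""

    async def _run():
        # wait_for cancels the run and raises a timeout error if it takes too long.
        return await asyncio.wait_for(agent.run(), timeout=timeout_seconds)

    # asyncio.run starts a fresh event loop; it cannot be called from inside one.
    return asyncio.run(_run())


class FakeAgent:
    """Hypothetical stand-in for browser_use.Agent, used only for this sketch."""

    def __init__(self, task):
        self.task = task

    async def run(self):
        await asyncio.sleep(0)  # yield control once, as a real agent would while working
        return f"done: {self.task}"


if __name__ == "__main__":
    print(run_agent_sync(FakeAgent("search for 'AI news'"), timeout_seconds=5))
```

With a real agent, the call would look like run_agent_sync(Agent(task=..., llm=llm)). Inside a framework that already runs an event loop, await agent.run() directly instead of using this helper.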
Troubleshooting
- If you get "playwright not installed" errors, run playwright install chromium.
- Ensure OPENAI_API_KEY is set in your environment before running the script.
- For permission errors, verify that your Python environment and package versions are compatible.
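A missing API key otherwise surfaces as a bare KeyError from os.environ, so one option is to check for it up front and fail with a readable message. The sketch below assumes a helper named require_api_key, which is illustrative and not provided by any of the libraries above.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a readable error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running the script."
        )
    return key


# Example use in the script above (assumes ChatOpenAI is already imported):
# llm = ChatOpenAI(model="gpt-4o-mini", openai_api_key=require_api_key())
```

The optional env argument exists only so the function can be exercised with a plain dictionary. In normal use it reads from os.environ.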
Key Takeaways
- Use Agent from browser_use with ChatOpenAI for browser automation powered by OpenAI.
- Install Playwright and Chromium to enable browser control.
- Agent.run() is a coroutine, so start it with asyncio.run() when running from a script.
- Change the LLM model and the task through the model and task parameters.
- Set environment variables securely; never hardcode API keys.