Add Browser Capabilities to AI with Browser Use
Browser Use is a Python library that allows AI agents to control a browser. By integrating Browserless with Browser Use, you can provide your AI applications with powerful web browsing capabilities without managing browser infrastructure.
Prerequisites
- Python 3.11 or higher
- An active Browserless API Token (available in your account dashboard)
Step-by-Step Setup
Go to your Browserless Account Dashboard and copy your API token.
Then set the BROWSERLESS_API_TOKEN environment variable, either in your .env file or on the command line:
- .env file
    BROWSERLESS_API_TOKEN=your-token-here
    OPENAI_API_KEY=your-openai-key-here
- Command line
    export BROWSERLESS_API_TOKEN=your-token-here
    export OPENAI_API_KEY=your-openai-key-here
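Before going further, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch, not part of Browser Use or Browserless; the helper name `missing_env_vars` is hypothetical.

```python
import os
from typing import Iterable, List, Mapping, Optional

# The two variables this guide asks you to set.
REQUIRED_VARS = ("BROWSERLESS_API_TOKEN", "OPENAI_API_KEY")


def missing_env_vars(
    names: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or blank in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after `load_dotenv()` returns an empty list when both values are present, so a script can stop early with a clear message instead of failing later at connection time.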
Set up a Python virtual environment to manage your dependencies:
- uv (recommended)
    uv venv --python 3.11
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- venv
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- conda
    conda create -n browser-use-env python=3.11
    conda activate browser-use-env
Install Browser Use and other required packages:
- uv (recommended)
    uv add browser-use python-dotenv
- pip
    pip install browser-use python-dotenv
- Poetry
    poetry add browser-use python-dotenv
Create a new file named main.py with the following complete code:
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatOpenAI
from dotenv import load_dotenv
import os
import asyncio

load_dotenv()


async def main():
    # Get the token from environment variables
    browserless_token = os.getenv('BROWSERLESS_API_TOKEN')
    if not browserless_token:
        raise ValueError("BROWSERLESS_API_TOKEN environment variable is required")

    browser_session = BrowserSession(cdp_url=f"wss://production-sfo.browserless.io?token={browserless_token}")
    llm = ChatOpenAI(model="gpt-4o")
    agent = Agent(
        task="find me the top cheapest trainer on ebay.co.uk",
        llm=llm,
        browser_session=browser_session
    )
    result = await agent.run()
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
Run your application with the following command:
python main.py
You should see output indicating that the browser is initialized and the agent is running.
How It Works
1. Connection Setup: Browser Use connects to Browserless using the WebSocket endpoint with your API token
2. Agent Configuration: The AI agent is configured with a task and a language model
3. Automation: The agent uses the browser to navigate and interact with websites
4. LLM Integration: The agent leverages an LLM (like GPT-4o) to interpret web content and make decisions
Additional Configuration
Proxy Support
You can enable a residential proxy for improved website compatibility:
browser_session = BrowserSession(cdp_url=f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_TOKEN']}&proxy=residential")
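Hand-building the query string gets error-prone once more than one parameter is involved. Here is a minimal sketch of a helper that assembles the endpoint with `urllib.parse.urlencode`. The name `build_cdp_url` is hypothetical, and only the `token` and `proxy=residential` parameters shown in this guide are assumed to exist.

```python
from typing import Optional
from urllib.parse import urlencode

# Host used throughout this guide.
BROWSERLESS_HOST = "production-sfo.browserless.io"


def build_cdp_url(
    token: str,
    proxy: Optional[str] = None,
    host: str = BROWSERLESS_HOST,
) -> str:
    """Build the Browserless WebSocket URL, URL-encoding each query value."""
    if not token:
        raise ValueError("A Browserless API token is required")
    params = {"token": token}
    if proxy:
        # For example proxy="residential", as in the line above.
        params["proxy"] = proxy
    return f"wss://{host}?{urlencode(params)}"
```

The result can be passed straight to `BrowserSession(cdp_url=...)`, for example `build_cdp_url(os.environ['BROWSERLESS_API_TOKEN'], proxy="residential")`.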
Browser Session Configuration
Customize the browser session with additional settings:
import os

from browser_use import BrowserSession
from browser_use.browser import BrowserProfile

browser_session = BrowserSession(
    cdp_url=f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_TOKEN']}",
    browser_profile=BrowserProfile(
        user_agent="Custom User Agent",
        viewport_size={"width": 1920, "height": 1080},
        headless=True,
    )
)
Advanced Features
CDP Events and LiveURL
Browserless provides powerful Chrome DevTools Protocol (CDP) events that can enhance your browser automation. Here are some key features:
- LiveURL for User Interaction
    # Create a CDP session. This assumes `page` is a Playwright page;
    # Playwright for Python creates CDP sessions from the browser context.
    cdp = await page.context.new_cdp_session(page)
    # Generate a LiveURL for user interaction
    response = await cdp.send('Browserless.liveURL', {
        "timeout": 600000  # 10 minutes timeout
    })
    live_url = response["liveURL"]
    print(f"Share this URL with users: {live_url}")
    # Wait for user to complete interaction
    future = asyncio.Future()
    cdp.on('Browserless.liveComplete', lambda: future.set_result(True))
    await future

  For more details, see our LiveURL Documentation.
- Captcha Detection
    # Listen for captcha detection
    cdp.on('Browserless.captchaFound', lambda: print('Captcha detected!'))
    # Solve captcha automatically
    response = await cdp.send('Browserless.solveCaptcha', {
        "appearTimeout": 20000
    })
    solved, error = response.get("solved"), response.get("error")

  Learn more about handling captchas in our Hybrid Automation Guide.
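- Session Recording
    # Start recording the session
    await cdp.send("Browserless.startRecording")
    # ... perform actions ...
    # Stop recording and save
    response = await cdp.send("Browserless.stopRecording")
    with open("recording.webm", "wb") as f:
        # The response is a dict. This assumes the recording arrives as a
        # binary string under "value"; latin-1 maps each character to one byte.
        f.write(response["value"].encode("latin-1"))

  See our Recording Documentation for more details.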
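The LiveURL snippet above waits on a bare future, which blocks forever if the user never finishes and raises if the event fires twice. A minimal sketch of a more defensive wait follows. The helper name `wait_for_event` is hypothetical, and it assumes only that the emitter (such as the `cdp` session) exposes an `on(event_name, callback)` method.

```python
import asyncio
from typing import Any


async def wait_for_event(emitter: Any, event_name: str, timeout: float) -> bool:
    """Wait until emitter fires event_name.

    Returns True if the event fired, False if `timeout` seconds elapsed first.
    """
    loop = asyncio.get_running_loop()
    fired = loop.create_future()

    def on_event(*_args: Any) -> None:
        # Ignore repeat events and events that arrive after a timeout.
        if not fired.done():
            fired.set_result(True)

    emitter.on(event_name, on_event)
    try:
        return await asyncio.wait_for(fired, timeout)
    except asyncio.TimeoutError:
        return False
```

For example, `completed = await wait_for_event(cdp, 'Browserless.liveComplete', timeout=600)` mirrors the 10-minute LiveURL timeout and lets the script continue or clean up when the user never completes the interaction.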
Complete Example with CDP Events
Here's a complete example that combines LiveURL, captcha handling, and session recording:
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatOpenAI
import asyncio
import os


async def main():
    browserless_token = os.getenv('BROWSERLESS_API_TOKEN')
    browser_session = BrowserSession(cdp_url=f"wss://production-sfo.browserless.io?token={browserless_token}")
    try:
        # Start the browser session
        await browser_session.start()
        page = await browser_session.get_current_page()
        # Create CDP session. This assumes `page` is a Playwright page;
        # Playwright for Python creates CDP sessions from the browser context.
        cdp = await page.context.new_cdp_session(page)
        # Start recording
        await cdp.send("Browserless.startRecording")
        # Generate LiveURL
        response = await cdp.send('Browserless.liveURL', {
            "timeout": 600000
        })
        live_url = response["liveURL"]
        print(f"Share this URL with users: {live_url}")
        # Handle captcha if detected
        cdp.on('Browserless.captchaFound', lambda: print('Captcha detected!'))
        # Wait for user interaction
        future = asyncio.Future()
        cdp.on('Browserless.liveComplete', lambda: future.set_result(True))
        await future
        # Stop recording and save
        response = await cdp.send("Browserless.stopRecording")
        with open("recording.webm", "wb") as f:
            # The response is a dict. This assumes the recording arrives as a
            # binary string under "value"; latin-1 maps each character to one byte.
            f.write(response["value"].encode("latin-1"))
    finally:
        await browser_session.close()


if __name__ == "__main__":
    asyncio.run(main())
For more information about CDP events and features, refer to the LiveURL Documentation, the Hybrid Automation Guide, and the Recording Documentation mentioned above.
Advanced Usage
For more advanced usage scenarios, please refer to: