Connecting Puppeteer
Puppeteer can connect to Browserless through Chrome's DevTools Protocol (CDP) using websockets. This is the primary and recommended way to connect to Browserless, as it provides a stable and reliable connection.
Once you have your API key from the Browserless dashboard, you can connect Puppeteer to Browserless.
The examples below show the connection in JavaScript (first) and Python (second, using pyppeteer):
import puppeteer from "puppeteer-core";
// Replace with the API key from your Browserless dashboard
const TOKEN = "YOUR_API_TOKEN_HERE";
// Connecting to Chrome locally (shown for comparison):
// const browser = await puppeteer.launch();
// Connecting to Browserless
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io?token=${TOKEN}`,
});
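import asyncio
from pyppeteer import connect
# Replace with the API key from your Browserless dashboard
TOKEN = "YOUR_API_TOKEN_HERE"
async def main():
    # Connecting to Chrome locally (shown for comparison):
    # browser = await launch()  # requires: from pyppeteer import launch
    # Connecting to Browserless
    browser = await connect(
        browserWSEndpoint=f"wss://production-sfo.browserless.io?token={TOKEN}"
    )
    # Detach from the remote browser when done
    await browser.disconnect()
asyncio.run(main())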
Migrating from local Chrome to Browserless
To migrate from local Chrome to Browserless, the main change is switching from puppeteer.launch() to puppeteer.connect() with a WebSocket endpoint. It is recommended to use puppeteer-core instead of puppeteer, as it's more lightweight and doesn't include Chromium binaries:
Before Browserless
import puppeteer from "puppeteer";
const browser = await puppeteer.launch();
const page = await browser.newPage();
// ...
After Browserless
import puppeteer from "puppeteer-core";
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE`,
});
const page = await browser.newPage();
// ...
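One way to keep a single script usable in both setups is to choose between launch() and connect() at runtime. The sketch below is only an illustration: openBrowser and its parameters are hypothetical names, not part of Puppeteer or Browserless, and the puppeteer module is passed in as an argument so the helper does not depend on which package is installed.

```javascript
// Hypothetical helper: returns a promise for a Browserless browser when a
// token is supplied, and for a locally launched Chrome otherwise.
// `puppeteer` is the imported puppeteer or puppeteer-core module.
function openBrowser(puppeteer, token, host = "production-sfo.browserless.io") {
  if (token) {
    // Encode the token so special characters cannot break the query string
    const endpoint = `wss://${host}?token=${encodeURIComponent(token)}`;
    return puppeteer.connect({ browserWSEndpoint: endpoint });
  }
  // With puppeteer-core, launch() also needs an executablePath or channel option
  return puppeteer.launch();
}

// Usage (the environment variable name is an assumption):
// const browser = await openBrowser(puppeteer, process.env.BROWSERLESS_TOKEN);
```

Leaving the token unset then falls back to a local browser, which can be convenient during development.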
Example: Getting Page Title
Here's an example that uses Puppeteer with Browserless to get the title of a page:
import puppeteer from "puppeteer-core";
async function main() {
  const url = "https://www.example.com";
  const token = "YOUR_API_TOKEN_HERE";
  const browser = await puppeteer.connect({
    browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${token}`,
  });
  const page = await browser.newPage();
  // Navigate to the URL defined above
  await page.goto(url);
  const title = await page.title();
  console.log(`The page's title is: ${title}`);
  await browser.close();
}
main().catch(error => {
  console.error("Unhandled error in main function:", error);
});
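In the example above, an error thrown by page.goto() or page.title() skips browser.close(), so the remote session may stay open longer than needed. A minimal sketch of one way to guard against that, assuming a hypothetical helper named withBrowser that receives a function returning a connected browser:

```javascript
// Hypothetical helper: runs `task` with a browser and always closes the
// browser afterwards, even when the task throws. `connect` is any function
// returning a promise for a browser, for example
// () => puppeteer.connect({ browserWSEndpoint }).
async function withBrowser(connect, task) {
  const browser = await connect();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Usage:
// const title = await withBrowser(
//   () => puppeteer.connect({ browserWSEndpoint }),
//   async (browser) => {
//     const page = await browser.newPage();
//     await page.goto("https://www.example.com");
//     return page.title();
//   }
// );
```

The error still propagates to the caller, so the existing main().catch() handler keeps working unchanged.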
If your Puppeteer scripts are being blocked by bot detectors, you can use BrowserQL to create a browser instance with advanced stealth features and then connect to it with the reconnect method.
Reduce awaits as much as possible
Most Puppeteer operations are async, and each awaited call makes a network round-trip from your application to Browserless and back. These round-trips add latency and slow down your automation. To minimize this, batch operations together whenever possible.
The key principle: use page.evaluate() to run multiple DOM operations in a single round-trip, rather than making a separate call for each operation. Inside page.evaluate(), code runs directly in the browser context, so multiple DOM queries and manipulations happen without additional network calls.
Examples
DON'T DO (3 network round-trips)
const $button = await page.$(".buy-now");
const buttonText = await $button.getProperty("innerText");
const clicked = await $button.click();
DO (1 network round-trip)
const buttonText = await page.evaluate(() => {
  const $button = document.querySelector(".buy-now");
  $button.click();
  return $button.innerText;
});
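The same idea extends to reading many values at once. As a rough sketch, the function below gathers the text and disabled state of every button in one pass; passed to page.evaluate(), it runs in the page and returns plain data in a single round-trip. The name collectButtonInfo and the "button" selector are illustrative assumptions, and the optional root parameter exists only so the function can be exercised outside a browser.

```javascript
// Runs inside the browser when passed to page.evaluate(); `root` then
// defaults to the page's document. Passing a root explicitly is only
// useful for testing the function outside a browser.
function collectButtonInfo(root = document) {
  const buttons = Array.from(root.querySelectorAll("button"));
  // Return plain objects only, so the result can be sent back to Node
  return buttons.map((button) => ({
    text: button.innerText.trim(),
    disabled: Boolean(button.disabled),
  }));
}

// Usage (one round-trip, however many buttons the page has):
// const info = await page.evaluate(collectButtonInfo);
```

Fetching the same data with page.$$() followed by per-element calls would cost at least one round-trip per button.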
Using Proxies
When using Puppeteer with Browserless, you can set up a proxy by adding the proxy parameter to your connection URL. You can also choose the country your traffic is routed through with the proxyCountry parameter.
import puppeteer from "puppeteer-core";
// Connect with residential proxy located in the US
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&proxy=residential&proxyCountry=us`,
});
const page = await browser.newPage();
// Visit a site that shows your IP address to verify proxy is working
await page.goto("https://icanhazip.com/");
console.log(await page.content());
await browser.close();
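As the number of query parameters grows, assembling the endpoint by hand becomes error-prone. The sketch below builds it with the standard URL API so that each value is encoded properly. buildEndpoint is a hypothetical helper name; only the token, proxy and proxyCountry parameters shown above are assumed to exist.

```javascript
// Hypothetical helper: builds a Browserless WebSocket endpoint with the
// token and optional proxy settings as encoded query parameters.
function buildEndpoint(token, { proxy, proxyCountry } = {}) {
  const url = new URL("wss://production-sfo.browserless.io/");
  url.searchParams.set("token", token);
  if (proxy) url.searchParams.set("proxy", proxy);
  if (proxyCountry) url.searchParams.set("proxyCountry", proxyCountry);
  return url.toString();
}

// Usage:
// const browser = await puppeteer.connect({
//   browserWSEndpoint: buildEndpoint(TOKEN, { proxy: "residential", proxyCountry: "us" }),
// });
```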
Using one or both of these parameters costs 6 units per MB of traffic, so consider blocking media requests to reduce the bandwidth consumed.
// Enable request interception before navigating to any page
await page.setRequestInterception(true);
page.on('request', (request) => {
  // Block .jpg and .png images to reduce bandwidth usage and costs
  if (request.url().endsWith('.jpg') || request.url().endsWith('.png')) {
    request.abort();
  } else {
    // Allow all other requests to proceed
    request.continue();
  }
});
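Matching on the URL suffix misses images served with a query string (for example photo.jpg?w=200) or in other formats such as .webp. As an alternative sketch, the filter can key on Puppeteer's request.resourceType() instead. The BLOCKED_TYPES set and the shouldBlock name are illustrative choices, not Browserless settings.

```javascript
// Resource types treated as heavy here; adjust the set to suit your pages.
const BLOCKED_TYPES = new Set(["image", "media", "font"]);

// Decides whether an intercepted request should be aborted. `request` is a
// Puppeteer HTTPRequest (any object exposing resourceType() works).
function shouldBlock(request) {
  return BLOCKED_TYPES.has(request.resourceType());
}

// Usage, after `await page.setRequestInterception(true)`:
// page.on("request", (request) =>
//   shouldBlock(request) ? request.abort() : request.continue()
// );
```

Blocking fonts or media can change how a page renders, so check that the data you need is still present after enabling the filter.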
For more detailed information about using built-in or third-party proxies with Puppeteer, see our Proxies documentation.