In this guide, we’ll introduce you to puppeteer-extra, a library that wraps puppeteer to extend it with plugin support. Get ready to take your Puppeteer scraping project to the next level! 🚀
puppeteer-extra is a lightweight wrapper around puppeteer that enables plugin integration through a clean interface. Although it's not developed by the team behind Puppeteer, this community-driven project has hundreds of thousands of weekly downloads and over 6k stars on GitHub 📈.
The puppeteer-extra repo has been on a steady rise in popularity over the years.
User-Agent header on all pages, with support for dynamic replacement.
No doubt, Puppeteer is one of the top headless browser automation tools. But let’s be honest: it has its limits, especially when facing anti-bot tech like browser fingerprinting and CAPTCHAs. Read our guide to learn how to deal with reCAPTCHA automation.
Puppeteer Extra is like a power-up for Puppeteer, adding plugin support to tackle those major drawbacks. Instead of overriding or extending everything for you, it wraps Puppeteer and lets you register only the plugins you need. 🦸
puppeteer-extra: Setup and Plugins for Web Scraping
First, install puppeteer-extra:
npm install puppeteer-extra
⚠️ Note: puppeteer-extra requires puppeteer to work, so make sure both packages are installed in your project.
Then, you have to import the puppeteer object from puppeteer-extra instead of the puppeteer library:
const puppeteer = require("puppeteer-extra")
// for ESM users:
// import puppeteer from "puppeteer-extra"
Everything in the Puppeteer API stays the same, but you get a little extra magic ✨. The puppeteer object now exposes a use() method to plug in Puppeteer Extra plugins.
⚙️ Installation:
npm install puppeteer-extra-plugin-stealth
💡 Usage:
const StealthPlugin = require("puppeteer-extra-plugin-stealth")
// for ESM users:
// import StealthPlugin from "puppeteer-extra-plugin-stealth"
puppeteer.use(StealthPlugin())
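To see how the pieces fit together, here is a minimal sketch of an end-to-end flow: register plugins, launch the browser, visit a page, and read its title. The helper name scrapeTitle and its parameters are our own illustration, not part of puppeteer-extra. It takes the puppeteer object as an argument so you can pass in whatever you imported from puppeteer-extra.

```javascript
// Sketch only: scrapeTitle is a hypothetical helper, not a library function.
// `puppeteer` is assumed to be the object imported from puppeteer-extra, and
// `plugins` an array of already-instantiated plugins, e.g. [StealthPlugin()].
async function scrapeTitle(puppeteer, plugins, url) {
  for (const plugin of plugins) {
    puppeteer.use(plugin) // register every plugin before launching
  }
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    await page.goto(url, { waitUntil: "domcontentloaded" })
    return await page.title()
  } finally {
    await browser.close() // always release the browser, even on errors
  }
}

// Example call (assumes puppeteer, puppeteer-extra, and the stealth plugin are installed):
// const puppeteer = require("puppeteer-extra")
// const StealthPlugin = require("puppeteer-extra-plugin-stealth")
// scrapeTitle(puppeteer, [StealthPlugin()], "https://www.example.com/").then(console.log)
```

Passing the puppeteer object in, rather than requiring it inside the helper, is just a design choice that keeps the function easy to try out with a stub.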
A plugin to prevent the Puppeteer browser from loading specific resources. The supported resource types include document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest, other.
⚙️ Installation:
npm install puppeteer-extra-plugin-block-resources
💡 Usage:
const BlockResourcesPlugin = require("puppeteer-extra-plugin-block-resources")
// for ESM users:
// import BlockResourcesPlugin from "puppeteer-extra-plugin-block-resources"
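// Option 1: configure the blocked types up front
puppeteer.use(BlockResourcesPlugin({
  blockedTypes: new Set(["image", "stylesheet"]),
}))
// Option 2: keep a reference to the plugin and update the blocked types at runtime
const blockResourcesPlugin = BlockResourcesPlugin()
puppeteer.use(blockResourcesPlugin)
const browser = await puppeteer.launch()
const page = await browser.newPage()
// Block stylesheets for the navigation that follows
blockResourcesPlugin.blockedTypes.add("stylesheet")
await page.goto("https://www.example.com/", { waitUntil: "domcontentloaded" })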
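Since blockedTypes is a plain Set of strings, a typo in a type name is easy to miss. Here is a small sketch of a guard that checks the requested types against the list of supported resource types from this guide before building the Set. The helper buildBlockedTypes is hypothetical and not part of the plugin.

```javascript
// Resource types listed in this guide as supported by the plugin.
const SUPPORTED_TYPES = new Set([
  "document", "stylesheet", "image", "media", "font", "script", "texttrack",
  "xhr", "fetch", "eventsource", "websocket", "manifest", "other",
])

// Hypothetical helper (our own, not from the plugin): validates the requested
// types and returns a Set suitable for the `blockedTypes` option.
function buildBlockedTypes(types) {
  const unknown = types.filter((t) => !SUPPORTED_TYPES.has(t))
  if (unknown.length > 0) {
    throw new Error(`Unsupported resource type(s): ${unknown.join(", ")}`)
  }
  return new Set(types)
}

// Possible usage, assuming BlockResourcesPlugin is imported as shown earlier:
// puppeteer.use(BlockResourcesPlugin({ blockedTypes: buildBlockedTypes(["image", "font"]) }))
```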
A plugin to anonymize the User-Agent set by the browser controlled by Puppeteer. 🎭
It lets you strip the 'Headless' string from the Chrome user agent in headless mode and supports dynamic replacement of the user agent through a custom function.
⚙️ Installation:
npm install puppeteer-extra-plugin-anonymize-ua
💡 Usage:
const AnonymizeUAPlugin = require("puppeteer-extra-plugin-anonymize-ua")
// for ESM users:
// import AnonymizeUAPlugin from "puppeteer-extra-plugin-anonymize-ua"
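// Option 1: strip the 'Headless' string from the user agent
puppeteer.use(AnonymizeUAPlugin({
  stripHeadless: true,
}))
// Option 2: rewrite the user agent with a custom function
puppeteer.use(AnonymizeUAPlugin({
  customFn: (ua) => ua.replace("Chrome", "Chromium"),
}))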
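Because customFn is just a function from one user agent string to another, you can write it as a standalone function and try it on sample strings before wiring it into the plugin. The sketch below is one possible version: the name normalizeUserAgent and the Windows platform token it inserts are illustrative assumptions, not plugin defaults.

```javascript
// Hypothetical customFn candidate: a pure string-to-string function.
// It removes the "Headless" marker and swaps the first parenthesized
// platform token for a Windows one (an arbitrary choice for illustration).
function normalizeUserAgent(ua) {
  return ua
    .replace("HeadlessChrome/", "Chrome/")
    .replace(/\(([^)]+)\)/, "(Windows NT 10.0; Win64; x64)")
}

// Possible usage, assuming AnonymizeUAPlugin is imported as shown earlier:
// puppeteer.use(AnonymizeUAPlugin({ customFn: normalizeUserAgent }))
```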
Just like with Playwright, no matter how slick and customized your Puppeteer script is, advanced anti-bot systems can still sniff you out and shut you down. But how is that even possible? 🤔
The puppeteer-extra-plugin-stealth documentation breaks it down for you:
Please note: I consider this a friendly competition in a rather interesting cat and mouse game. If the other team (👋) wants to detect headless chromium there are still ways to do that (at least I noticed a few, which I'll tackle in future updates). It's probably impossible to prevent all ways to detect headless chromium, but it should be possible to make it so difficult that it becomes cost-prohibitive or triggers too many false-positives to be feasible.
So, while Puppeteer Extra can dodge most basic bot detection like Neo in The Matrix, it can’t reliably bypass Cloudflare. You could try additional workarounds, but even those might not be enough.
Puppeteer is one of the most widely used browser automation tools in the tech world, but even superheroes have their limits. The community stepped in with puppeteer-extra, a package that gives Puppeteer some seriously cool new abilities through custom plugins.