How to bypass Cloudflare with Node.js

ZFC Software
4 min read · Jun 7, 2024


Cloudflare has been updated frequently lately, and some bypass methods no longer work. This article was written to help you learn how to avoid bot detection systems like Cloudflare with Node.js.

In this article we will cover two methods: one uses Puppeteer and the other uses selenium-webdriver. Although Puppeteer is very useful, it is one of the main things bot detection systems look for.

Q: Is Puppeteer a replacement for Selenium WebDriver?

No. Both projects are valuable for very different reasons:

Selenium WebDriver focuses on cross-browser automation and provides bindings for multiple languages; Puppeteer is only for JavaScript.

Puppeteer focuses on Chromium; its value proposition is richer functionality for Chromium-based browsers.

When we examine Puppeteer’s FAQ page, we can see that this library is suited to browser testing, and that selenium-webdriver is recommended for automation.

By reviewing this GitHub issue, you can understand the problems involved and see different uses of the methods discussed in this article.

How to Traverse Cloudflare with Puppeteer?

For this we will use targetFilter and puppeteer-extra-plugin-stealth. As we all know, the Stealth plugin tries to avoid detection by altering the browser's fingerprint. However, a library with 500k monthly downloads does not go unnoticed by bot detection systems such as reCAPTCHA v3, Cloudflare, and FingerprintJS.

For this reason, they can all detect us even when we use the plugin. This is where targetFilter comes in. targetFilter disables Puppeteer on the targets it filters out. This lets the browser report its real fingerprint values instead of the modified values that caused the detection.

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

// Register the stealth plugin before launching the browser
puppeteer.use(StealthPlugin());

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    // Attach Puppeteer only to targets that already have a URL
    targetFilter: (target) => !!target.url(),
  });

  // Reuse the tab that the browser opens on launch
  const [page] = await browser.pages();

  await page.goto('https://nopecha.com/demo/cloudflare');
})();

With this simple setup, we can bypass many measures such as Cloudflare and reCAPTCHA v3. However, you may sometimes run into problems such as page navigation timing out or not being able to open a new page. In that case, you can write targetFilter as a separate function and add a toggle system.

 targetFilter: target => false,

A filter like this will completely disable Puppeteer. If you set your custom targetFilter function to return false when opening a new page, you can successfully create a new page.
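The toggle system mentioned above could look roughly like the sketch below. This is only an illustration: createToggleableTargetFilter is a hypothetical helper name, not part of Puppeteer, and the target objects are plain stand-ins with a url() method so the logic can be run without a browser.

```javascript
// Hypothetical helper (not a Puppeteer API): wraps the article's filter
// in a switch so it can be turned off and on at runtime.
function createToggleableTargetFilter() {
  let enabled = true;
  return {
    // While enabled, behaves like `target => !!target.url()`.
    // While disabled, behaves like `target => false`.
    targetFilter: (target) => enabled && !!target.url(),
    enable() { enabled = true; },
    disable() { enabled = false; },
  };
}

// Stand-in targets for the demo; real targets are supplied by Puppeteer.
const fakePage = { url: () => 'https://example.com/' };
const blankTarget = { url: () => '' };

const toggle = createToggleableTargetFilter();
console.log(toggle.targetFilter(fakePage));    // true
console.log(toggle.targetFilter(blankTarget)); // false

toggle.disable();
console.log(toggle.targetFilter(fakePage));    // false

toggle.enable();
console.log(toggle.targetFilter(fakePage));    // true
```

Assuming the same launch call as in the earlier example, you would pass toggle.targetFilter as the targetFilter option and call toggle.disable() or toggle.enable() around the step that gives you trouble. Whether this resolves a given timeout depends on your Puppeteer version, so treat it as a starting point.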

How to Pass Cloudflare with selenium-webdriver?

We can easily pass it with a few additions. In selenium-webdriver, the only thing that gives us away is navigator.webdriver. We can easily fix this by adding the Chrome argument --disable-blink-features=AutomationControlled.

const { Builder, Browser } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

(async function example() {
  // Stop Chrome from flagging the session as automation-controlled
  const options = new chrome.Options();
  options.addArguments('--disable-blink-features=AutomationControlled');

  const driver = await new Builder()
    .forBrowser(Browser.CHROME)
    .setChromeOptions(options)
    .build();

  await driver.get('https://nopecha.com/demo/cloudflare');
})();
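To see why navigator.webdriver matters, here is a very simplified sketch of the kind of check a page script could run. The function name reportsWebdriver is hypothetical, and the navigator objects are plain stand-ins. Real detection systems combine many more signals than this.

```javascript
// Hypothetical, simplified check for illustration only.
// A WebDriver-controlled browser normally reports navigator.webdriver === true.
function reportsWebdriver(navigatorLike) {
  return navigatorLike.webdriver === true;
}

// Stand-in navigator objects; in a browser you would pass window.navigator.
const automatedNavigator = { webdriver: true };
const regularNavigator = { webdriver: false };

console.log(reportsWebdriver(automatedNavigator)); // true
console.log(reportsWebdriver(regularNavigator));   // false
console.log(reportsWebdriver({}));                 // false
```

With a real driver, you could read the actual value with driver.executeScript('return navigator.webdriver') before and after adding the argument to compare the results.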

Pre-built Libraries for Cloudflare

You may not want to deal with all this. As someone who has dealt with these problems for a long time, I have created some libraries so that you don’t have to.

This library creates a web server on port 3000. You can easily obtain WAF or Turnstile tokens by sending a request to this web server. It will return a header object for you to pass WAF successfully. You can use this header object in your direct requests.

This library can also be used with Puppeteer and contains additional modules.

Conclusion

I have tried to show how to get past many bot detection systems such as Cloudflare and reCAPTCHA v3. I may have shortcomings and may have made mistakes. I learn more every day and I try to share what I learn.

I did not write this article to help anyone damage websites by bypassing Cloudflare. People with such aggressive intentions would not launch browsers with headless set to false, because that consumes a lot of resources. This article is more for those who are interested in web scraping.

Thank you…
