Using HTTP proxy with Puppeteer

Gajus Kuizinas
Published in The Startup
2 min read · Jan 30, 2020

I had a requirement to evaluate remote JavaScript using Headless Chrome, but requests had to be routed through an internal proxy, and different proxies had to be used for different URLs. A convoluted requirement perhaps, but the last bit describes an important feature that Puppeteer is lacking: switching the HTTP proxy for each page or request.

However, it turns out that even though the feature is lacking, it is easy to implement entirely custom HTTP request/response handling with Puppeteer. All you need to do is:

  1. Enable request/response interception using page.setRequestInterception(true).
  2. Intercept the request.
  3. Make the request using Node.js.
  4. Return the response to Chrome.

This way Chrome itself never makes an outgoing HTTP request and all requests can be handled using Node.js.

The basic functionality is simple to implement:

import puppeteer from 'puppeteer';
import got from 'got';
import HttpProxyAgent from 'http-proxy-agent';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // 1. Enable request/response interception
  await page.setRequestInterception(true);

  // 2. Intercept every request; once interception is enabled, any request
  // that is not answered stays pending
  page.on('request', async (request) => {
    // 3. Make request using Node.js
    const response = await got(request.url(), {
      // HTTP proxy.
      agent: new HttpProxyAgent('http://127.0.0.1:3000'),
      body: request.postData(),
      headers: request.headers(),
      method: request.method(),
      retry: 0,
      throwHttpErrors: false,
    });

    // 4. Return response to Chrome
    await request.respond({
      body: response.body,
      headers: response.headers,
      status: response.statusCode,
    });
  });

  await page.goto('http://gajus.com');
  await browser.close();
})();
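
One gap in the handler above is failure: if the Node.js request throws (for example, because the proxy is unreachable), the intercepted request is never answered and the page keeps waiting. Below is a minimal sketch of one way to deal with that, not a complete solution. The names createRequestHandler and fetchThroughProxy are hypothetical; fetchThroughProxy stands in for the got() call above and is assumed to resolve to an object with body, headers and statusCode.

```javascript
// Hypothetical helper: `fetchThroughProxy` is any async function that takes
// the intercepted request and resolves to {body, headers, statusCode}.
const createRequestHandler = (fetchThroughProxy) => {
  return async (request) => {
    let response;

    try {
      response = await fetchThroughProxy(request);
    } catch (error) {
      // Tell Chrome the request failed instead of leaving it pending.
      await request.abort('failed');

      return;
    }

    await request.respond({
      body: response.body,
      headers: response.headers,
      status: response.statusCode,
    });
  };
};
```

The returned function can be passed to the page's 'request' event in place of the inline callback. Taking the fetch function as a parameter keeps the proxy logic separate from the Puppeteer plumbing and makes the handler easy to exercise with a fake request object.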

It gets a bit trickier if you need to support HTTPS, error handling and cookies. However, as of last night, there is a package for that: puppeteer-proxy.

puppeteer-proxy abstracts HTTP proxy handling for Puppeteer, including HTTPS support, error and cookie handling. Using it is simple:

import puppeteer from 'puppeteer';
import {
  createPageProxy,
} from 'puppeteer-proxy';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const pageProxy = createPageProxy({
    page,
  });

  await page.setRequestInterception(true);

  // Proxy every intercepted request, not just the first one
  page.on('request', async (request) => {
    await pageProxy.proxyRequest({
      request,
      proxyUrl: 'http://127.0.0.1:3000',
    });
  });

  await page.goto('http://gajus.com');
  await browser.close();
})();
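
Because proxyUrl is supplied per request, the original requirement (different proxies for different URLs) reduces to a lookup. The following is a rough sketch of such a lookup, not part of puppeteer-proxy. The selectProxyUrl function, the rule format, the hostnames and the proxy addresses are all made up for illustration.

```javascript
// Hypothetical routing table: hostnames and proxy addresses are placeholders.
const proxyRules = [
  {hostSuffix: 'internal.example.com', proxyUrl: 'http://127.0.0.1:3000'},
  {hostSuffix: 'example.org', proxyUrl: 'http://127.0.0.1:3001'},
];

// Used when no rule matches the requested host.
const defaultProxyUrl = 'http://127.0.0.1:3002';

// Returns the proxy for the first rule whose suffix equals the hostname
// or matches it on a subdomain boundary.
const selectProxyUrl = (requestUrl) => {
  const {hostname} = new URL(requestUrl);

  const rule = proxyRules.find(({hostSuffix}) => {
    return hostname === hostSuffix || hostname.endsWith('.' + hostSuffix);
  });

  return rule ? rule.proxyUrl : defaultProxyUrl;
};
```

Inside the request handler, the fixed address would then become proxyUrl: selectProxyUrl(request.url()).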

Founder, engineer interested in JavaScript, PostgreSQL and DevOps. Follow me on Twitter for outbursts about startups & engineering. https://twitter.com/kuizinas