Selenium + Chrome Dev-tools makes a Perfect Browser Automation Recipe

AMIT RAWAT
5 min read · Nov 16, 2017

I was browsing Twitter and came across a tweet mentioning that Chrome 63 is coming with multi-client remote debugging support.

In sheer excitement I also replied to this tweet.

You must be wondering why I was so excited by this feature. It excites me because it opens up a whole lot of possibilities for people like me who work in test automation.

Recently I struggled with several browser automation problems that Selenium WebDriver was not well equipped to solve.

Issues like:

  • Handling HTTP Basic Authentication: If you are not familiar with this, try opening this link; the browser will show a pop-up asking for credentials. Enter admin as both username and password and it will authenticate you. One workaround is to pass the credentials in the URL itself, but that sometimes fails when there are redirects.
  • Getting events from the browser, such as when a specific network call has completed.
  • Setting a fake geolocation from your WebDriver scripts.
  • Updating the user agent.
  • Mocking the web response by intercepting the traffic.
  • Simulating network bandwidth, especially for mobile Chrome.
  • Getting a screencast of my Chrome session: It is possible to get a screencast of the whole screen using VNC, but this is not possible when multiple Chrome sessions are running.

The list is long and there are no easy ways to solve these issues. Some of the above issues can be solved by injecting a chrome extension and some can be solved by introducing a proxy server.

I will try to explain how we can solve these problems using Chrome's remote debugging interface. As an aside, the Chrome team recently released a library called Puppeteer, which provides a Node.js/JavaScript API for doing things similar to what I show in this article. If your technology stack is not Node.js, then read on.

First, some background on how the chromedriver.exe server controls Chrome when our Selenium tests send WebDriver commands.

Up to Chrome 57, ChromeDriver used an automation extension that was injected into Chrome to facilitate communication between ChromeDriver and the browser.

From Chrome 58 onwards this extension is no longer used, and everything is controlled through Chrome's DevTools API, which communicates over WebSocket and also supports remote debugging. After this change you will start seeing the message that Chrome is being controlled by this remote protocol.

Let's go a little deeper into Chrome's remote debugging protocol. The DevTools WebSocket API lets you control any local or remote browser, but up to Chrome 62 only one client can connect to this interface at a time. This means that while Selenium is controlling the browser, no other client can debug it.

From Chrome 63, which will be released on 5th December 2017, multiple clients can connect to Chrome and debug it.

Enough talking, let's get our hands dirty with some Java code.

We launch Chrome using ChromeDriverService, because we need to parse the chromedriver logs to find the port on which the remote debugger is running. Note that I am pointing to a Chromium nightly build, which is equivalent to Chrome 63, as Chrome 63 is not yet released. You can also use Canary builds.

Now we will extract the WebSocket port on which the debugger is running. In principle we could run it on a specific port with the command-line switch below, but at present Selenium throws an exception when we launch ChromeDriver with this flag.

--remote-debugging-port=9222

Here is the code to get this port:
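A minimal sketch of one way to do this, assuming the chromedriver verbose log contains the Chrome launch command line with a `--remote-debugging-port=<port>` argument. The class name `DebugPortFinder` and the sample log line in the usage note are illustrative, not taken from a real run.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DebugPortFinder {
    // Assumption: chromedriver logs the Chrome command line, including this switch.
    private static final Pattern PORT =
            Pattern.compile("--remote-debugging-port=(\\d+)");

    /** Returns the first remote debugging port mentioned in the log text, if any. */
    static OptionalInt findPort(String logText) {
        Matcher matcher = PORT.matcher(logText);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }
}
```

For a log line such as `Launching chrome: "chrome.exe" --remote-debugging-port=12326 --disable-extensions`, `findPort` yields 12326. The log text itself could be read with `Files.readString` from whatever log file the ChromeDriverService was configured to write.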

Once we have the port, we request this URL:

http://localhost:<port>/json

This will return the following json:

[ {
"description": "",
"devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:12326/devtools/page/4f500019-edde-4c1e-a464-699638eb5fce",
"id": "4f500019-edde-4c1e-a464-699638eb5fce",
"title": "data:,",
"type": "page",
"url": "data:,",
"webSocketDebuggerUrl": "ws://localhost:12326/devtools/page/4f500019-edde-4c1e-a464-699638eb5fce"
} ]

We are interested in the json attribute “webSocketDebuggerUrl” in the above json.
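As a rough sketch, the lookup could be done with the JDK's own HTTP client (Java 11+). The class and method names here are made up for illustration, and the regular expression is a shortcut that assumes the response looks like the sample above; a real JSON parser would be more robust.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DebuggerEndpoint {
    private static final Pattern WS_URL =
            Pattern.compile("\"webSocketDebuggerUrl\"\\s*:\\s*\"([^\"]+)\"");

    /** Extracts the first webSocketDebuggerUrl value from the /json response body. */
    static Optional<String> firstWebSocketUrl(String json) {
        Matcher matcher = WS_URL.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Requests http://localhost:<port>/json and extracts the debugger URL from it. */
    static Optional<String> discover(int port) throws Exception {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:" + port + "/json"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return firstWebSocketUrl(response.body());
    }
}
```

If several tabs are open the response lists one entry per tab, and this sketch simply takes the first one.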

Now we can connect to this WebSocket URL using any Java library of our choice. I used this one (nv-websocket-client).

Let’s see how we can solve some of the issues which I mentioned above.

Setting the fake geo-location:

We simply need to send this json message to chrome, where

  • id: can be any unique number; it helps you match the response to your request, as everything is asynchronous here.
  • method: is the operation/function name. The Chrome DevTools Protocol documentation lists all the supported operations.
{
"id": 3,
"method": "Emulation.setGeolocationOverride",
"params": {
"latitude": 27.1752868,
"longitude": 78.040009,
"accuracy": 100
}
}

The response of the above command would be something like this:

{ "id": 3, "result": { } }

This simple message does all the magic. Now browse to maps.google.com and you will see your current location as the Taj Mahal, a famous monument in my native place, Agra.

Below is the code I used to send this WebSocket message. For this demo I keep things synchronous in code, so it waits until the response to my command arrives.
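As a rough sketch of the same synchronous pattern, here is a version built on the JDK's `java.net.http.WebSocket` (Java 11+) rather than nv-websocket-client. The names `DevToolsSession`, `sendAndWait` and `isReplyTo` are made up for illustration, the 10-second timeouts are arbitrary, and the id matching is a simple regular expression that assumes no other top-level-looking `"id"` field appears in the incoming message.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeUnit;

final class DevToolsSession implements AutoCloseable {
    private final WebSocket socket;
    // This sketch supports one command in flight at a time.
    private volatile int pendingId;
    private volatile CompletableFuture<String> pendingReply = new CompletableFuture<>();

    DevToolsSession(String webSocketDebuggerUrl) throws Exception {
        WebSocket.Listener listener = new WebSocket.Listener() {
            private final StringBuilder buffer = new StringBuilder();

            @Override
            public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
                buffer.append(data);              // a message may arrive in fragments
                if (last) {
                    String message = buffer.toString();
                    buffer.setLength(0);
                    if (isReplyTo(pendingId, message)) {
                        pendingReply.complete(message);
                    }                             // anything else (events) is ignored
                }
                ws.request(1);                    // ask for the next frame
                return null;
            }
        };
        socket = HttpClient.newHttpClient().newWebSocketBuilder()
                .buildAsync(URI.create(webSocketDebuggerUrl), listener)
                .get(10, TimeUnit.SECONDS);
    }

    /** Sends one command and blocks until the reply carrying the same id arrives. */
    synchronized String sendAndWait(int id, String json) throws Exception {
        pendingId = id;
        pendingReply = new CompletableFuture<>();
        socket.sendText(json, true).get(10, TimeUnit.SECONDS);
        return pendingReply.get(10, TimeUnit.SECONDS);
    }

    /** True if the raw message looks like the reply to the command with this id. */
    static boolean isReplyTo(int id, String message) {
        return message.matches("(?s).*\"id\"\\s*:\\s*" + id + "\\b.*");
    }

    @Override
    public void close() {
        socket.sendClose(WebSocket.NORMAL_CLOSURE, "done");
    }
}
```

The session is kept open between commands instead of being closed after each one, on the assumption that an override may not outlive the connection that set it.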

Handling HTTP Basic Authentication:

Now we know the trick of controlling the WebDriver Chrome session by sending these WebSocket messages. The hard work is already done; we just need to figure out the JSON message for each use case.

Before playing around with any of the network related messages, as a prerequisite we have to enable network debugging by this message:

{
"id": 1,
"method": "Network.enable",
"params": {
"maxTotalBufferSize": 10000000,
"maxResourceBufferSize": 5000000
}
}

Below is the json for appending an extra http header which will take care of handling http basic authentication.

{
"id": 2,
"method": "Network.setExtraHTTPHeaders",
"params": {
"headers": {
"Authorization": "Basic YWRtaW46YWRtaW4="
}
}
}
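The header value in the message above is the word Basic followed by the Base64 encoding of `username:password`. A minimal sketch of how such a value could be produced for other credentials; the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuth {
    /** Builds the Authorization header value for HTTP Basic Authentication. */
    static String headerValue(String username, String password) {
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Calling `BasicAuth.headerValue("admin", "admin")` gives `Basic YWRtaW46YWRtaW4=`, the value used in the message above.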

Network Emulation:

Sometimes it is very handy to see how our web application behaves in offline mode or on a fluctuating network. Here is the message:

{
"id": 7,
"method": "Network.emulateNetworkConditions",
"params": {
"offline": true,
"latency": 300,
"downloadThroughput": 250,
"uploadThroughput": 750
}
}
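As far as I understand the protocol documentation, latency is given in milliseconds and the two throughput values in bytes per second, so the 250 and 750 above correspond to only 2 and 6 kilobits per second. A small sketch of a helper that does this conversion and builds the message; the names are hypothetical and 1 kilobit is taken as 1000 bits.

```java
final class NetworkConditions {
    /** Converts kilobits per second to bytes per second (1 kbit = 1000 bits assumed). */
    static long kbpsToBytesPerSecond(long kilobitsPerSecond) {
        return kilobitsPerSecond * 1000 / 8;
    }

    /** Builds a Network.emulateNetworkConditions message from friendlier units. */
    static String emulate(int id, boolean offline, int latencyMs,
                          long downloadKbps, long uploadKbps) {
        return "{\"id\":" + id
                + ",\"method\":\"Network.emulateNetworkConditions\""
                + ",\"params\":{\"offline\":" + offline
                + ",\"latency\":" + latencyMs
                + ",\"downloadThroughput\":" + kbpsToBytesPerSecond(downloadKbps)
                + ",\"uploadThroughput\":" + kbpsToBytesPerSecond(uploadKbps)
                + "}}";
    }
}
```

To simulate a slow connection rather than a disconnected one, `offline` would presumably be set to false so that the latency and throughput limits actually come into play.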

Overriding UserAgent:

Json message for doing this:

{
"id": 7,
"method": "Network.setUserAgentOverride",
"params": {
"userAgent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.81 Safari/537.36"
}
}

I will leave it to you to figure out the JSON messages for the remaining use cases. I am happy to answer in the comments section if you need any help. Here is the complete working code available. If you liked this article, please press the clap icon; it will encourage me to write more content.

AMIT RAWAT

I am a Civil Engineer by qualification, an Engineering Manager by profession and a Developer by passion. (amitrawat.dev)