Using headless Chrome via the websockets interface

Here’s a short tutorial on how to use headless Chrome by connecting to it over the websockets interface. Chrome’s own documentation is a bit lacking, although there are now many libraries that provide an easier-to-use API for different languages.

The main advantage of this approach over Selenium is that you get access to lower-level methods in Chrome, which makes it easier to detect browser events and to parallelize work across different tabs.


How the headless Chrome protocol works

Chrome uses the DevTools protocol (https://chromedevtools.github.io/devtools-protocol/) over a websocket connection. All messages are encoded as JSON, and interactions between the client and Chrome involve two kinds of messages:

Methods: These messages are the requests you make to Chrome. For example, to ask Chrome to browse to a page you would use the Page.navigate method, and to execute some JavaScript in the target window you would use Runtime.evaluate. Each method request needs a unique numerical identifier (1, 2, 3, etc.) chosen by the client; Chrome includes it in the response so that the client can match results to requests.

Here’s an example of sending a navigate request:

Client message: {"method": "Page.navigate", "id": 1, "params": {"url": "http://google.com"}}
Browser response: {"id": 1, "result": {"frameId": "13244.1"}}
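
As a rough illustration of the request/response pairing described above, the sketch below builds a method message and checks whether a reply belongs to it. The helper names build_command and is_response_to are made up for this example; they are not part of the protocol or of any library.

```python
import json


def build_command(request_id, method, **params):
    """Serialize a devtools method call as a JSON string."""
    return json.dumps({"id": request_id, "method": method, "params": params})


def is_response_to(raw_message, request_id):
    """Return True if raw_message is the reply to the given request id."""
    return json.loads(raw_message).get("id") == request_id


message = build_command(1, "Page.navigate", url="http://google.com")
# A reply shaped like the browser response in the example above.
reply = '{"id": 1, "result": {"frameId": "13244.1"}}'
print(message)
print(is_response_to(reply, 1))
```

The id is the only thing that ties a reply to its request, which is why the client has to keep it unique per connection.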

Events: The devtools protocol also lets you subscribe to different types of events, for example to know when a page has finished loading, when an asset gets downloaded, or when the DOM is modified.

By default Chrome doesn’t send any events; the client needs to subscribe to each type of event it wants to receive. I’m leaving the details of events for another post, as we won’t need them to get some simple code running.
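
To make the distinction concrete: a response carries the id of the request it answers, while an event has a method name and no id. The sketch below classifies incoming messages on that basis. The classify_message helper is hypothetical and the sample payload values are invented; Page.enable and Page.loadEventFired come from the protocol's Page domain.

```python
import json


def classify_message(raw_message):
    """Label a raw devtools message as a 'response' or an 'event'."""
    msg = json.loads(raw_message)
    if "id" in msg:
        return "response"
    if "method" in msg:
        return "event"
    return "unknown"


# Subscribing to page events is itself a method call.
subscribe = json.dumps({"id": 1, "method": "Page.enable", "params": {}})

# Illustrative messages the browser might send back (values are made up).
ack = '{"id": 1, "result": {}}'
event = '{"method": "Page.loadEventFired", "params": {"timestamp": 1234.5}}'

print(classify_message(ack))
print(classify_message(event))
```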

Automating Chrome with a simple Python script

We’re going to implement a Python script that opens Chrome headlessly, navigates to Google News and extracts the news headings to display them on the command line.

Step 1: installing requirements

The main requirements for this project are:

  1. Python 3.5 or greater
  2. Chrome 59 or greater (the first version that supports the headless argument)

You also need to install two external libraries: websocket-client to manage the websocket connection, and requests for HTTP calls.

pip3 install websocket-client
pip3 install requests

We’re going to work in a file called google_news.py. You’ll find the complete source code at the end of the post.

Step 2: Launching chrome with headless support

We’re going to create a function that, given the path of Chrome’s executable, launches it and creates a websocket connection to it.

import json
import time
import subprocess
import requests
from websocket import create_connection


def start_browser(browser_path, debugging_port):
    options = ['--headless',
               '--remote-debugging-port={}'.format(debugging_port)]
    browser_proc = subprocess.Popen([browser_path] + options)

    # Poll the debugging endpoint until chrome is ready (up to 10 seconds).
    wait_seconds = 10.0
    sleep_step = 0.25
    while wait_seconds > 0:
        try:
            url = 'http://127.0.0.1:{}/json'.format(debugging_port)
            resp = requests.get(url).json()
            # Connect to the first target listed (the default tab).
            ws_url = resp[0]['webSocketDebuggerUrl']
            return browser_proc, create_connection(ws_url)
        except requests.exceptions.ConnectionError:
            time.sleep(sleep_step)
            wait_seconds -= sleep_step

    raise Exception('Unable to connect to chrome')

The start_browser function receives the browser’s path as the first argument and Chrome’s remote debugging port as the second. It launches the browser process in headless mode and then polls the debugging port until the browser is ready. Once it is, the function fetches the default tab’s websocket address via an HTTP request and connects to the tab over a websocket.

The function returns a tuple with the process and the websocket connection.
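
start_browser takes the first entry of the /json listing. Depending on the setup, that listing can also contain targets that are not tabs, so a slightly more defensive variant could filter on the type field of each entry. The following is a sketch under that assumption; pick_page_ws_url is a hypothetical helper and the sample listing is invented for illustration.

```python
def pick_page_ws_url(targets):
    """Return the websocket URL of the first target whose type is 'page'."""
    for target in targets:
        if target.get("type") == "page" and "webSocketDebuggerUrl" in target:
            return target["webSocketDebuggerUrl"]
    raise LookupError("no page target found")


# Made-up listing in the shape returned by the /json endpoint.
sample_targets = [
    {"type": "service_worker",
     "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/A"},
    {"type": "page",
     "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/B"},
]
print(pick_page_ws_url(sample_targets))
```

Inside start_browser this would stand in for the resp[0]['webSocketDebuggerUrl'] lookup.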

Step 3: sending commands to chrome

Now we need to send a command to Chrome. The run_command function receives the method to run plus any additional keyword parameters for the command, sends the request, and reads messages until the response with the matching id arrives. We store the current request id in a global variable for simplicity.

request_id = 0


def run_command(conn, method, **kwargs):
    global request_id
    request_id += 1
    command = {'method': method,
               'id': request_id,
               'params': kwargs}
    conn.send(json.dumps(command))
    # Skip any other messages until the reply with our id arrives.
    while True:
        msg = json.loads(conn.recv())
        if msg.get('id') == request_id:
            return msg

For example, to tell Chrome to navigate to a page you would use:

result = run_command(conn, "Page.navigate", url="http://google.com")
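
run_command returns the whole reply, and when a call fails Chrome answers with an error field instead of result. A minimal sketch of unwrapping replies follows. unwrap_result and CommandError are names invented here, and the sample error payload is illustrative rather than captured from a real session.

```python
class CommandError(Exception):
    """Raised when the browser reports a failed method call."""


def unwrap_result(msg):
    """Return msg['result'], or raise CommandError if the reply is an error."""
    if "error" in msg:
        err = msg["error"]
        raise CommandError(
            "{} (code {})".format(err.get("message"), err.get("code")))
    return msg.get("result", {})


ok = {"id": 1, "result": {"frameId": "13244.1"}}
# Hypothetical error reply for a method name that does not exist.
failed = {"id": 2, "error": {"code": -32601, "message": "method not found"}}

print(unwrap_result(ok))
try:
    unwrap_result(failed)
except CommandError as exc:
    print("command failed:", exc)
```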

Step 4: getting news headlines from google

Now that we have the base functions to launch Chrome and send commands to it, we can test them by interacting with a page. We’re going to navigate to Google News (https://news.google.com/news/?ned=us&hl=en) and run some JavaScript to get the news headlines on the page.

gnews_url = 'https://news.google.com/news/?ned=us&hl=en'
# you need to set your machine's chrome path here
chrome_path = '/usr/bin/google-chrome-unstable'
browser, conn = start_browser(chrome_path, 9222)
run_command(conn, 'Page.navigate', url=gnews_url)
time.sleep(5) # let it load
js = """
var sel = '[role="heading"][aria-level="2"]';
var headings = document.querySelectorAll(sel);
headings = [].slice.call(headings).map((link)=>{return link.innerText});
JSON.stringify(headings);
"""
result = run_command(conn, 'Runtime.evaluate', expression=js)
headings = json.loads(result['result']['result']['value'])
for heading in headings:
    print(heading)

And that’s it. This is a simple example, and there are many more things that can be done with Chrome in headless mode. As you can see, the example runs requests synchronously, while a better implementation would send and receive messages asynchronously. It’s also useful to receive events from Chrome to know when a page has loaded, instead of hardcoding the amount of time to wait.
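
As a sketch of the event-based approach mentioned above, the helper below reads messages until a given event arrives instead of sleeping for a fixed time. wait_for_event and FakeConnection are hypothetical. The only thing assumed about the connection is a recv() method that returns a JSON string, as used by run_command. With a real connection you would first send Page.enable so that Chrome emits Page.loadEventFired.

```python
import json
import time


def wait_for_event(conn, event_name, timeout=10.0):
    """Read messages from conn until event_name arrives or timeout elapses.

    The deadline is only checked between messages, so a blocking recv()
    can still make this wait longer than timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        msg = json.loads(conn.recv())
        if msg.get("method") == event_name:
            return msg
    raise TimeoutError("timed out waiting for {}".format(event_name))


class FakeConnection:
    """Stand-in for a websocket connection that replays canned messages."""

    def __init__(self, messages):
        self._messages = list(messages)

    def recv(self):
        return self._messages.pop(0)


# Made-up message sequence: a reply, an unrelated event, then the load event.
fake = FakeConnection([
    '{"id": 1, "result": {}}',
    '{"method": "Page.frameStoppedLoading", "params": {"frameId": "1"}}',
    '{"method": "Page.loadEventFired", "params": {"timestamp": 1.0}}',
])
event = wait_for_event(fake, "Page.loadEventFired")
print(event["method"])
```

One caveat with this design: run_command discards every message that is not its own reply, so an event that arrives while a command is pending would be lost. A fuller implementation would queue such messages rather than drop them.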

For more information on the debugging protocol you can refer to https://developers.google.com/web/updates/2017/04/headless-chrome

Here’s the complete code. You need to set the chrome_path variable to the path of Chrome on your machine.

import json
import time
import subprocess
import requests
from websocket import create_connection


def start_browser(browser_path, debugging_port):
    options = ['--headless',
               '--remote-debugging-port={}'.format(debugging_port)]
    browser_proc = subprocess.Popen([browser_path] + options)

    wait_seconds = 10.0
    sleep_step = 0.25
    while wait_seconds > 0:
        try:
            url = 'http://127.0.0.1:{}/json'.format(debugging_port)
            resp = requests.get(url).json()
            ws_url = resp[0]['webSocketDebuggerUrl']
            return browser_proc, create_connection(ws_url)
        except requests.exceptions.ConnectionError:
            time.sleep(sleep_step)
            wait_seconds -= sleep_step

    raise Exception('Unable to connect to chrome')


request_id = 0


def run_command(conn, method, **kwargs):
    global request_id
    request_id += 1
    command = {'method': method,
               'id': request_id,
               'params': kwargs}
    conn.send(json.dumps(command))
    while True:
        msg = json.loads(conn.recv())
        if msg.get('id') == request_id:
            return msg


gnews_url = 'https://news.google.com/news/?ned=us&hl=en'
# you need to set your machine's chrome path here
chrome_path = '/usr/bin/google-chrome-unstable'

browser, conn = start_browser(chrome_path, 9222)
run_command(conn, 'Page.navigate', url=gnews_url)
time.sleep(5) # let it load

js = """
var sel = '[role="heading"][aria-level="2"]';
var headings = document.querySelectorAll(sel);
headings = [].slice.call(headings).map((link)=>{return link.innerText});
JSON.stringify(headings);
"""
result = run_command(conn, 'Runtime.evaluate', expression=js)
headings = json.loads(result['result']['result']['value'])
for heading in headings:
    print(heading)

browser.terminate()