USE PYTHON TO AUTOMATE GOOGLE LIGHTHOUSE REPORTS AND KEEP A HISTORICAL RECORD OF THESE

Olimpiu Șeulean
5 min read · Oct 10, 2021

According to Google’s published research, if a page takes more than 3 seconds to load, more than half of visitors abandon it.

This article gives a brief introduction to a tool provided by Google, Google Lighthouse, and walks through a Python script that automates it.

What is the Google Lighthouse tool/extension?

Google Lighthouse is an open-source tool that evaluates a web page’s speed, accessibility, best practices, and SEO. It was built to help web developers adhere to Google’s best practices and standards, and it works well as an automated site-monitoring tool.

Running Lighthouse in Google Chrome’s DevTools is familiar to most SEOs. Although using Lighthouse in the browser is simple and convenient, it is hard to scale. What if you want Lighthouse to run on several sites every day? As usual, Python steps in to save the day. This article provides the essentials for automating your Lighthouse scans, and it will be straightforward to adapt the tutorial to your specific needs.

One of the traditional ways of running Lighthouse

Environment:

* Python 3.X and basic syntax knowledge
* Node.js on your local machine
* Any OS

Starting the Automated Script:

We’re going to start by installing the Node CLI for the Lighthouse tool using the Node package manager. (The PageSpeed Insights API provides an alternative to running Lighthouse locally.) If you’re working in Google Colab, prefix the command with an exclamation point; otherwise, type the following into your terminal:
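The Lighthouse CLI lives in the `lighthouse` npm package:

```shell
# Install the Lighthouse CLI globally (prefix with "!" in Google Colab)
npm install -g lighthouse
```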

The next step is to install the required modules:
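The script below relies on pandas (for reading the URL list) and openpyxl (for writing the Excel results):

```shell
# Python packages used throughout the script
pip install pandas openpyxl
```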

The high-level ratings (0–100% for category scores, or seconds for timing metrics) will be saved in a JSON file. It’s worth knowing that this JSON file contains a wide variety of metrics. After you’ve run Lighthouse, open the JSON output file to check what’s useful in your case.

Create needed variables:

Now we’ll set up a few simple variables that we’ll use throughout the process. We’ll label the output file using the name and Python’s datetime module, then loop over the URL list and run Lighthouse on each entry. You may either build the list in code or import the links from a .csv file.
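As a sketch, the naming variables might look like this (the file-name prefixes are illustrative):

```python
from datetime import datetime

# Date stamp used in every output file name, e.g. 10-09-21
timestamp = datetime.now().strftime("%m-%d-%y")

# Illustrative file names for the two device reports
desktop_file = f"lighthouse_desktop_{timestamp}.csv"
mobile_file = f"lighthouse_mobile_{timestamp}.csv"
```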

Relative paths:

To save the reports, you must set a path where you want the Excel and JSON results to go:
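A minimal sketch, assuming a `report` folder under the current working directory:

```python
import os

# Folder (relative to the working directory) where reports will be written
report_dir = os.path.join(os.getcwd(), "report")
os.makedirs(report_dir, exist_ok=True)  # create it if it does not exist
```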

You will need to create two variables on how the files should be named.

You need to detect the active sheet and start processing it.
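With openpyxl, creating a workbook and grabbing its active sheet looks like this (the sheet title is illustrative):

```python
from openpyxl import Workbook

wb = Workbook()  # new in-memory workbook
ws = wb.active   # the active (default) worksheet
ws.title = "Lighthouse Results"
```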

Initialize an array of links that you want to run the Lighthouse script on:
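For example (replace these with your own pages):

```python
# URLs the Lighthouse script will be run against
urls = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
]
```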

Alternatively, if you prefer importing the URLs from a .csv file, use the following code to convert the data frame into a list:
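A sketch with pandas, assuming the file has a column named `url`; a tiny sample file is created here so the snippet is self-contained:

```python
import pandas as pd

# In practice you would already have a urls.csv with a "url" column;
# this just creates a sample so the snippet runs on its own
with open("urls.csv", "w") as f:
    f.write("url\nhttps://www.example.com/\nhttps://www.example.com/blog/\n")

df = pd.read_csv("urls.csv")
urls = df["url"].tolist()  # convert the data frame column into a plain list
```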

Create the ‘base’ object (in Python, a dictionary) that sets the header for each iteration in the Excel file, based on num_of_calls (how many times you want to run the script over the links, explained below).

If the value goes higher than 6 (e.g. num_of_calls = 7), you need to add another key-value pair, 7: ‘Seventh Run’, and so on.
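A sketch of such a mapping:

```python
# Header label for each run, keyed by iteration number
base = {
    1: "First Run",
    2: "Second Run",
    3: "Third Run",
    4: "Fourth Run",
    5: "Fifth Run",
    6: "Sixth Run",
}
```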

RUN LIGHTHOUSE:

Start creating the function that runs Lighthouse and sets up the logic for mobile and desktop reports:
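One way to sketch that function is to build the CLI command for either device. Mobile emulation is Lighthouse’s default; desktop audits use the `--preset=desktop` flag:

```python
def lighthouse_command(url: str, device: str, output_path: str) -> str:
    """Build the Lighthouse CLI command for a mobile or desktop audit."""
    # Mobile is the Lighthouse default; desktop needs the preset flag
    preset = "--preset=desktop " if device == "desktop" else ""
    return (
        f"lighthouse {url} {preset}--output=json "
        f'--output-path={output_path} --chrome-flags="--headless"'
    )
```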

Now is the time to go through that list of URLs and run Lighthouse! We are using the OS Python module to run Lighthouse from the CLI. See the documentation for details on all available options.

Since Python launches an application outside the script, we must pause the script and wait for Lighthouse to finish. I have found that 60 seconds works for most pages.
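Putting the loop and the pause together might look like the sketch below (the command and file names are illustrative, and the CLI is assumed to be installed globally via npm):

```python
import os
import time

def audit_urls(urls, pause_seconds=60):
    """Run Lighthouse over each URL from the CLI, pausing between runs."""
    report_paths = []
    for i, url in enumerate(urls):
        path = f"report_{i}.json"
        # Launch the Lighthouse CLI for this URL
        os.system(
            f"lighthouse {url} --output=json "
            f'--output-path={path} --chrome-flags="--headless"'
        )
        time.sleep(pause_seconds)  # wait so the report is fully written
        report_paths.append(path)
    return report_paths
```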

Once the pause is over and Lighthouse has most likely finished, let’s create the full path to the report file so we can process it in the next snippet.
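A sketch of building that path (the report name here is a placeholder; in the real script it comes from the URL and device variables set earlier):

```python
import os

# Illustrative report file name
report_name = "desktop_example_report.json"
full_path = os.path.join(os.getcwd(), "report", report_name)
```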

PROCESSING THE REPORT

Now let’s open the JSON report file and start processing it.

As I mentioned earlier, there is a ton of data in that JSON report file, and I advise you to read through it and select the items you wish to save. Keep in mind that these are scores out of 100. We multiply by 100 since Lighthouse saves these ratings as floats ranging from 0.00 to 1.00.

I decided to use a try/except block so that, if one of the items cannot be read from the JSON file, the script does not crash and the value is replaced with ‘-’.
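A sketch of that extraction. The nested `categories` structure below mirrors the real Lighthouse report (which contains far more data); here it is inlined as a small stand-in so the snippet is self-contained, where the real script would use `json.load()` on the report path:

```python
import json

# Stand-in for: report = json.load(open(full_path))
report = {
    "categories": {
        "performance": {"score": 0.93},
        "accessibility": {"score": 0.88},
        "best-practices": {"score": 1.0},
        "seo": {"score": 0.97},
    }
}

scores = {}
for category in ("performance", "accessibility", "best-practices", "seo"):
    try:
        # Lighthouse stores scores as floats from 0 to 1, so multiply by 100
        scores[category] = round(report["categories"][category]["score"] * 100)
    except (KeyError, TypeError):
        scores[category] = "-"  # fall back instead of crashing the script
```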

A few notes on how the script works so far:

Set the header again for the Excel file so that each iteration is separated by a break:

Save the results into the Excel file and set how many times you want to run the script:
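A sketch of the final save, assuming openpyxl; the file name, run count, and score row are illustrative:

```python
from openpyxl import Workbook

num_of_calls = 2  # how many times to run the script over the URL list
wb = Workbook()
ws = wb.active

for run in range(1, num_of_calls + 1):
    ws.append([f"Run {run}"])        # header row for this iteration
    ws.append(["performance", 93])   # illustrative score row

wb.save("lighthouse_results.xlsx")   # write the workbook to disk
```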

RESULTS:

lighthouse_desktop_10-09-21.csv (Results for the desktop run)

CONCLUSION:

Lighthouse has become a common tool for SEOs to understand the health of their pages. It’s time to take Lighthouse to the next level and move beyond DevTools.

  1. Google Lighthouse is not just for developers; it’s for anybody who wants to improve the overall performance of their website.
  2. You do not need any coding knowledge to produce a Google Lighthouse report. However, you will need access to the website’s code to make some of the necessary changes.
  3. Work on the specific opportunities listed in each area to raise the score in that category.
  4. Aim for a category score as close to 90–100 as feasible to ensure that users have a positive experience with your site (and, ideally, with whatever you’re selling).
