DATA STORIES | WEBSITE MONITORING | KNIME ANALYTICS PLATFORM

How to monitor the performance of your websites with KNIME and the Google PageSpeed Insights API

A low-code solution to easily benchmark the performance of your website with that of your competitors

gio_bi
Low Code for Data Science


The workflow in this article uses the PageSpeed Insights API to collect data on a site’s performance and usability. It is available for free on the KNIME Community Hub (see below).

Compare your site’s performance with that of your competitors via KNIME and the PageSpeed Insights API.

Before we begin, a few words about KNIME

KNIME is open-source data analytics software that relies on a low-code paradigm to foster data literacy and to streamline the creation and execution of data analyses. It has several strengths:

  • Ease of learning: KNIME is extremely intuitive and allows you to get your first results in no time, even after just a few hours of use.
  • Online learning resources: a wide range of video courses and e-books are available online to fully master this software. You can even choose to obtain professional certifications in the use of KNIME.
  • Collaborative community: KNIME has an active and collaborative community. In the forum, it is easy to find the solution to the problem that is holding you back, and if not, there will always be someone ready to help you out.
  • Comprehensiveness: KNIME allows you to perform a wide range of analyses and engage with many different machine learning models. You can even leverage one of those “large language models” that are so popular nowadays 😃 (see https://www.knime.com/blog/guide-to-build-your-own-LLM-solutions).
  • Free of charge: KNIME is completely free. Despite being free, it is professional, powerful software, not a compromise.

If I caught your attention and you want to give KNIME a chance, you can download it for free here: https://www.knime.com/downloads. Although registration is not mandatory, I highly recommend it, as it allows you to access the resources of the Forum and to be an active part of the community.

And now let’s move on to…

…PageSpeed Insights

PageSpeed Insights (PSI) is an online tool provided for free by Google. It evaluates a website’s performance on mobile and desktop devices and makes suggestions to improve its loading speed and user experience.

You can try it out by visiting the dedicated website (https://pagespeed.web.dev/) and entering the URL of the website you wish to analyze. After a few seconds, PageSpeed Insights will provide a report with five KPIs (one for each category analyzed: Performance, Accessibility, Best practices, SEO, and Progressive Web App) and a list of suggestions to improve both the performance of the site and the user experience on the site. (The Performance analysis provides additional details with respect to the following metrics: First Contentful Paint, Largest Contentful Paint, Speed Index, Cumulative Layout Shift, Time to Interactive, and Total Blocking Time.)

One of the most interesting aspects of Google PSI is that it allows any site to be analyzed. Since domain ownership verification is not required, you can use PSI to study your competitors’ websites and benchmark them against your own.

To do this, we have two alternatives:

  • Go to the PSI website, enter the URLs to be analyzed (ours and those of the “competitors”) one by one, and manually record the results in a spreadsheet (Excel or Google Sheets);
  • Or automate the process by asking KNIME to do the “heavy” work: retrieve the performance of the websites we are interested in via the PageSpeed Insights API and then produce a nice benchmark report (see the figure below) with two Kiviat charts (aka Radar graphs), one for “Desktop” mode navigation and the other for “Mobile” mode navigation.

Benchmarking the performance of different websites with PageSpeed Insights and KNIME.

I, of course, chose the second path: it is more fun and, once the workflow is set up, I can re-run it with a single click. In addition, it allows me to focus on the data analysis phase, which is far more interesting and valuable.
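
For readers curious about what the automated path looks like outside KNIME, here is a minimal Python sketch of the GET request URL for the PageSpeed Insights API v5. It is not taken from the workflow: the function name `build_psi_url` and the `CATEGORIES` list are illustrative choices, and the lowercase strategy values are an assumption based on the two modes mentioned above.

```python
from urllib.parse import urlencode

# Public endpoint of the PageSpeed Insights API, version 5.
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

# Categories to request; this list mirrors the five KPIs named in the article.
CATEGORIES = ["PERFORMANCE", "ACCESSIBILITY", "BEST_PRACTICES", "SEO", "PWA"]


def build_psi_url(site_url, strategy, api_key=None):
    """Return the GET URL for one site and one strategy ("mobile" or "desktop")."""
    params = [("url", site_url), ("strategy", strategy)]
    # "category" is a repeated query parameter: one entry per category.
    params += [("category", category) for category in CATEGORIES]
    # The key is optional here; it is appended only when one is supplied.
    if api_key:
        params.append(("key", api_key))
    return PSI_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    for strategy in ("mobile", "desktop"):
        print(build_psi_url("https://www.knime.com", strategy))
```

Each printed URL corresponds to one call, so benchmarking a site in both modes takes two requests.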

The workflow has three sections:

  1. The first allows you to provide your API key in the String Configuration node, and use the Table Creator node to indicate the URLs of the websites to monitor.
  2. The second section loops over every website to monitor and uses the GET Request node to retrieve information from the PageSpeed Insights API. Additionally, the JSON Path and other data manipulation nodes are used to parse and process the retrieved information.
  3. The third section saves the processed data locally (for any further analysis) and prepares it for on-screen visualization.
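
The three sections above can be mirrored in a few lines of Python. This is only a sketch of the same idea, not a translation of the KNIME nodes: `fetch_scores` is a hypothetical callable standing in for the request-and-parse step, and the numbers in the demo are made up so that the example runs offline.

```python
import csv
import io
from itertools import product

# The two navigation modes benchmarked in the article.
STRATEGIES = ("mobile", "desktop")


def collect_scores(site_urls, fetch_scores):
    """Loop over every site and strategy and gather one row per combination.

    `fetch_scores(site_url, strategy)` is a hypothetical callable that performs
    the API request and returns a dict such as {"performance": 87.0}.
    """
    rows = []
    for site_url, strategy in product(site_urls, STRATEGIES):
        scores = fetch_scores(site_url, strategy)
        rows.append({"url": site_url, "strategy": strategy, **scores})
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text so they can be saved for later analysis."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Stand-in fetcher with made-up numbers, so the sketch runs without network access.
    def fake_fetch(site_url, strategy):
        return {"performance": 50.0, "seo": 90.0}

    table = collect_scores(["https://example.com", "https://example.org"], fake_fetch)
    print(rows_to_csv(table))
```

Passing the fetcher in as an argument keeps the loop independent of how the request is made, which also makes it easy to swap in a slower, rate-limited version later.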

Before running the workflow (you can download it from the KNIME Community Hub at this link: https://hub.knime.com/-/spaces/-/~d1XTiU-e2x2qP5cn/current-state/), a few clarifications are needed:

  • An API key is not required to try the PageSpeed Insights API; you only need one if you plan to run more than two queries per second. You can get a key here (https://developers.google.com/speed/docs/insights/v5/get-started) or, if you have a Google Developer account, by creating a project in Google Cloud (https://console.cloud.google.com/). Note, however, that the workflow itself expects an API key.
  • The PageSpeed Insights API returns a JSON file that is so full of data that it can be disorienting. I have collected the main summary indicators (Performance, Accessibility, Best Practices, SEO and PWA) but, with some patience, you can import into KNIME all the other metric scores provided by Google. For details of the HTTP request parameters and response structure see: https://developers.google.com/speed/docs/insights/v5/reference/pagespeedapi/runpagespeed.
  • The data retrieval step via the PageSpeed Insights API is quite slow, so don’t worry if the KNIME nodes take longer than usual to execute (they are not running in circles!).
  • In the downloadable workflow, you will actually find two workflows:
    the one at the top (enclosed in the green frame) is the final one;
    below (in the light blue frame) I left a working draft version. It makes it easier to see some steps that are less intuitive in the final version. The draft version is also fully functional, just remember to enter your API key in the GET Request nodes labeled as “PSI — mobile” and “PSI — desktop”.
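
To give an idea of how the summary indicators can be picked out of the large JSON response mentioned above, here is a small Python sketch. It assumes the response layout described in Google's reference (category scores under `lighthouseResult.categories`, individual metrics under `lighthouseResult.audits`); the function names are illustrative, and the sample fragment is hand-written rather than a real measurement.

```python
# Category ids as they appear under lighthouseResult.categories in the response.
CATEGORY_IDS = ("performance", "accessibility", "best-practices", "seo", "pwa")


def extract_category_scores(response):
    """Map each category id to a 0-100 score, or None when it is absent."""
    categories = response.get("lighthouseResult", {}).get("categories", {})
    scores = {}
    for category_id in CATEGORY_IDS:
        raw = categories.get(category_id, {}).get("score")
        # The API reports scores between 0 and 1; scale them to 0-100.
        scores[category_id] = None if raw is None else round(raw * 100, 1)
    return scores


def extract_metric(response, audit_id):
    """Return the numeric value of one audit (e.g. "speed-index"), if present."""
    audits = response.get("lighthouseResult", {}).get("audits", {})
    return audits.get(audit_id, {}).get("numericValue")


if __name__ == "__main__":
    # Hand-written fragment with illustrative values, not real measurements.
    sample = {
        "lighthouseResult": {
            "categories": {
                "performance": {"score": 0.5},
                "seo": {"score": 1},
            },
            "audits": {"speed-index": {"numericValue": 1234.5}},
        }
    }
    print(extract_category_scores(sample))
    print(extract_metric(sample, "speed-index"))
```

Returning `None` for a missing category, instead of raising an error, keeps the table complete even when a response does not include every indicator.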

Before saying goodbye, I’d like to share a few useful links to learn more about this topic and customize the workflow to your liking:
