What’s in the Google PageSpeed Score?

How Lighthouse calculates your score and how to use it

Csaba Palfi
Expedia Group Technology
7 min read · Jun 25, 2019

Google uses page speed as a signal in mobile search and ad ranking, and its Lighthouse-powered PageSpeed Insights (PSI) score helps you measure and boost your page’s performance. Let’s look under the hood at how it works.

A rare example of a perfect score

The PSI score is between 0 and 100 and is generated from a Lighthouse run. Google considers a speed score of 90–100 as fast, 50–89 as average, and 0–49 as slow. You can think of PSI as a hosted Lighthouse web interface (and API).
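
Since PSI is also an API, you can fetch the score programmatically. Here’s a minimal sketch using Node 18+’s built-in fetch in an ES module; the endpoint and response shape are from the PSI v5 API (an API key is only needed for heavier usage):

const url = 'https://www.google.com';
const endpoint =
  'https://www.googleapis.com/pagespeedonline/v5/runPagespeed' +
  `?url=${encodeURIComponent(url)}&strategy=mobile`;

const response = await fetch(endpoint);
const { lighthouseResult } = await response.json();

// PSI returns the performance score as 0-1; the UI shows it as 0-100.
const score = Math.round(lighthouseResult.categories.performance.score * 100);
const bucket = score >= 90 ? 'fast' : score >= 50 ? 'average' : 'slow';
console.log(`${url}: ${score} (${bucket})`);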

The 5 metrics that affect the score

The Lighthouse scoring documentation explains that the performance score is determined using the following estimated metrics:

  • First Contentful Paint (FCP): when the first text or image content is painted.
  • First Meaningful Paint (FMP): when the primary content of a page is visible.
  • Speed Index (SI): how quickly the contents of a page are visibly populated.
  • First CPU Idle (FCI): when the main thread first becomes quiet enough to handle input.
  • Time to Interactive (TTI): when the main thread and network are quiet for at least 5s.
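
The same lighthouseResult from the API sketch above carries the raw metric values. The audit IDs below are the Lighthouse 5 ones (note that TTI lives under interactive):

// Estimated metric values in milliseconds, from the PSI v5 response.
const { audits } = lighthouseResult;
const metrics = {
  FCP: audits['first-contentful-paint'].numericValue,
  FMP: audits['first-meaningful-paint'].numericValue,
  SI: audits['speed-index'].numericValue,
  FCI: audits['first-cpu-idle'].numericValue,
  TTI: audits['interactive'].numericValue,
};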

None of the other Lighthouse audits has a direct impact on the score, but they do give hints on improving the metrics. To learn more about the metrics, check out my awesome-web-performance-metrics repo on GitHub.

Lantern, covered in more detail below, is the part of Lighthouse that models page activity and simulates browser execution to estimate these metrics.

Not all metrics are created equal

Lighthouse calculates a speed score for each of the 5 metrics based on their estimated values, then takes a weighted average to produce the aggregate speed score. The metric weights and fast/slow thresholds are summarized in the table below:

Metric weights and score thresholds
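
As a sketch of the aggregation step, assuming the Lighthouse 5 weights of FCP 3, SI 4, FMP 1, FCI 2, and TTI 5 (check lighthouse-core/config/default-config.js for the authoritative values):

// Weighted average of the per-metric scores (each in 0-1).
const weights = { FCP: 3, SI: 4, FMP: 1, FCI: 2, TTI: 5 };

function aggregateScore(metricScores) {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const [metric, weight] of Object.entries(weights)) {
    weightedSum += metricScores[metric] * weight;
    totalWeight += weight;
  }
  return Math.round((weightedSum / totalWeight) * 100); // reported as 0-100
}

// A page acing everything but TTI still loses noticeable points:
aggregateScore({ FCP: 1, SI: 1, FMP: 1, FCI: 1, TTI: 0.7 }); // => 90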

Scoring is not linear; rather, each metric’s estimated value is mapped onto a log-normal distribution curve to produce its score, so equal millisecond improvements do not move the score equally.
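
Here is a minimal sketch of that mapping. Lighthouse derives the curve’s location and shape from per-metric control points; the numbers in the usage line below are illustrative only:

// Score = complementary CDF of a log-normal distribution: fast values
// score near 1, slow values tail off toward 0.
function logNormalScore(valueMs, location, shape) {
  const standardized = (Math.log(valueMs) - location) / (Math.SQRT2 * shape);
  return (1 - erf(standardized)) / 2;
}

// Abramowitz & Stegun 7.1.26 approximation of the error function.
function erf(x) {
  const sign = Math.sign(x);
  const t = 1 / (1 + 0.3275911 * Math.abs(x));
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
    0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-x * x));
}

// A value equal to the curve's median always scores exactly 0.5:
logNormalScore(7300, Math.log(7300), 0.9); // => 0.5 (illustrative numbers)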

Recommendations for using the score

  • Use the score to look for longer-term trends and to identify big changes, but prefer your own analytics/field data for finer details
  • Metrics in the red usually highlight genuine problems, even though actual values are not 100% accurate
  • Reduce variability by doing multiple runs, forcing A/B test variants, and other means — but even with reduced variability, some inherent inaccuracies remain
  • Try the pagespeed-score node module to reduce/identify variability and to investigate inaccuracies

How does Lighthouse estimate metrics?

Lantern is the part of Lighthouse that estimates metrics by emulating mobile network speeds and CPU execution in a simulation. It trades some accuracy for execution speed, though the simulation also removes certain sources of variability. See the detailed accuracy and variability analysis made available by Google. Metrics can be over- or underestimated because of:

  • Differences in the unthrottled trace vs. the real device or throttling
  • Details ignored or simplified to make the simulation workable

Here’s a detailed breakdown of how Lantern works.

1. Create a page dependency graph

  • Lighthouse loads the page without any throttling (this is usually referred to as the observed trace)
  • Build a dependency graph based on the network records and the CPU trace
  • Link up CPU tasks and network requests related to each other

See lighthouse-core/computed/page-dependency-graph.js for the implementation of this process.

(via Project Lantern Overview — slide 7 by Patrick Hulce)
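
To make this concrete, here is a hypothetical sketch of the kind of structure Lantern builds (the real node classes live in lighthouse-core/lib/dependency-graph/); the IDs, sizes, and timestamps below are made up:

// Hypothetical sketch: network requests and CPU tasks become nodes, and
// edges capture "X had to finish before Y could start".
const nodes = [
  { id: 'doc', type: 'network', url: '/', transferSize: 14000, endTime: 400 },
  { id: 'appJs', type: 'network', url: '/app.js', transferSize: 120000, endTime: 900 },
  { id: 'evalAppJs', type: 'cpu', duration: 300, endTime: 1300 },
];
const edges = [
  ['doc', 'appJs'],       // the document references the script...
  ['appJs', 'evalAppJs'], // ...which must download before it executes
];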

2. Create a subgraph for each metric

  • Identify the nodes contributing to a metric by comparing each node’s end timestamp to the metric’s timestamp observed in the unthrottled run
  • Filter CPU and network nodes to create a subgraph containing only those nodes contributing to the delay of a specific metric
  • See lighthouse-core/computed/metrics/lantern-* for the implementation of the subgraphs

(via Project Lantern Overview — slide 8 by Patrick Hulce)
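
Continuing the sketch above, carving out the subgraph for FCP could look like this; the observed FCP timestamp is a made-up number:

// Keep only the nodes that finished before FCP fired in the observed
// (unthrottled) trace; later nodes cannot have delayed it.
const observedFcpMs = 1200;
const fcpNodes = nodes.filter((node) => node.endTime <= observedFcpMs);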

3. Simulate subgraphs with emulated mobile conditions

(via Project Lantern Overview — slide 9 by Patrick Hulce)
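
Finally, the simulator replays each subgraph under emulated mobile conditions. A heavily simplified sketch, continuing from above (the throttling values approximate Lighthouse’s simulated slow-4G defaults of a 150ms round trip, ~1.6Mbps throughput, and a 4x CPU slowdown):

// Estimate how long each node takes under mobile throttling.
const RTT_MS = 150;
const BYTES_PER_MS = (1.6 * 1024 * 1024) / 8 / 1000; // ~1.6Mbps in bytes/ms
const CPU_MULTIPLIER = 4;

function estimateDuration(node) {
  return node.type === 'network'
    ? RTT_MS + node.transferSize / BYTES_PER_MS
    : node.duration * CPU_MULTIPLIER;
}

// Crude serial approximation: sum the nodes on the metric's subgraph.
// The real simulator walks the graph, overlapping independent network and
// CPU work and modeling TCP slow-start, connection reuse, and more.
const estimatedFcpMs = fcpNodes.reduce((ms, node) => ms + estimateDuration(node), 0);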

The pagespeed-score module

pagespeed-score is a command line toolkit to get speed score and metrics from the Google PageSpeed Insights API (PSI API) or a local Lighthouse run. It’s available as a node module from npm and is easy to try with npx.

$ npx pagespeed-score https://www.google.com
name    score  FCP  FMP  SI   FCI  TTI
run 1   96     1.2  1.2  1.2  3.3  3.7

Use --help to see the list of all options or look at the code on GitHub.

Local mode

pagespeed-score --local switches to running Lighthouse locally instead of calling the PSI API. This can be useful for non-public URLs (e.g. a staging environment on a private network) or for debugging. To keep local results close to the PSI API results, this module:

  • Uses the same version of Lighthouse as PSI (5.0.0 as of 21 June 2019)
  • Uses the LightRider mobile config
  • Allows throttling the CPU with --cpu-slowdown, defaulting to 4x. PSI infrastructure already runs on a slower CPU (similar to a mobile device), hence the need to slow the CPU down for local runs, which usually happen on more powerful hardware.
  • Permits use of the same Chrome version as PSI (76 as of 21 June 2019) by specifying CHROME_PATH:
CHROME_PATH="<path to Chrome Canary or Beta binary>" \
npx pagespeed-score --local "<url>"

Local results will still differ from the PSI API because of local hardware and network variability.

Reducing variability

The Lantern accuracy and variability analysis discusses a number of sources of variability. Some of them are already mitigated by PSI or Lantern, but as a user, you can take steps to reduce the variability of your scores and metrics even further.

Multiple runs

Test multiple times and take the median (or richer statistics) of the score to reduce the impact of outliers, whatever is causing the variability.

You can use the pagespeed-score CLI:

  • --runs <N> overrides the number of runs from the default of 1; with more than 1 run, summary statistics are calculated:
$ npx pagespeed-score --runs 3 https://www.google.com
name    score  FCP  FMP  SI   FCI  TTI
run 1   96     0.9  1.0  1.2  3.1  3.9
run 2   96     0.9  1.0  1.0  3.1  3.7
run 3   95     0.9  1.0  1.2  3.5  4.0
median  96     0.9  1.0  1.2  3.1  3.9
stddev  0.6    0.0  0.0  0.1  0.2  0.2
min     95     0.9  1.0  1.0  3.1  3.7
max     96     0.9  1.0  1.2  3.5  4.0
  • --warmup-runs <N> adds warmup runs that are excluded from the stats (e.g., to let content delivery networks or other caches warm up)

Force A/B test variants

By making sure we always test the same variants of any A/B tests running on the page, we can ensure they don’t introduce page nondeterminism.
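
How you pin a variant depends on your A/B testing setup; many allow forcing a variant via a query parameter or cookie. The parameter below is hypothetical:

$ npx pagespeed-score "https://www.example.com/?forceVariant=control"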

Use feature flags to turn off third-party scripts

Sometimes third-party scripts or other features on the page introduce variability. As a last resort, add a flag to turn these off to get a more stable score. Don’t rely exclusively on scores and metrics captured this way, as real users will still experience your page with all of these features turned on.
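
As with A/B variants, the mechanics depend on your own setup; a hypothetical feature-flag query parameter might look like this:

$ npx pagespeed-score "https://www.example.com/?thirdPartyScripts=off"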

Identifying sources of variability

Additional data points that are not directly taken into account in the score calculation can help you identify sources of variability.

Benchmark index

Lighthouse computes a memory/CPU benchmark index to determine a rough device class. Variability here can indicate client hardware differences or client resource contention. This is less likely with PSI, which runs in a highly controlled lab environment, and more likely to affect local Lighthouse runs.

You can use the pagespeed-score CLI to monitor this:

  • --benchmark adds the benchmark index as a metric for each test run

Time to first byte

Time to first byte (TTFB) has a very limited impact on the score, but can be a useful indicator of web server variability. TTFB is based on the observed/fast trace.

You can use the pagespeed-score CLI to monitor this:

  • --ttfb adds TTFB as a metric for each test run

User timing marks and measures

We use a number of user timing marks, and high variability in these can indicate page nondeterminism or other sources of variability. Note that user timing marks are not estimated by Lantern; they are based on the observed/fast trace.

You can use the pagespeed-score CLI to monitor them:

  • --usertiming-marks.<alias>=<name> adds the User Timing mark named <name> to your metrics under <alias> (e.g., --usertiming-marks.DPA=datepicker.active)
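
The marks themselves come from the standard User Timing API in your page code; for example, the datepicker.active mark referenced above could be recorded like this:

// Record a mark as soon as the datepicker becomes usable; Lighthouse
// reads named marks like this out of the trace.
performance.mark('datepicker.active');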

Identifying inaccuracies

By default, Lighthouse estimates metrics rather than measuring them on a real throttled device, and, as with any estimation, there will be inaccuracies. Your page’s performance may have changed in the field even if your PSI score remains the same. You can dig deeper to understand these inaccuracies.

Debug Lantern metrics estimation locally

To understand how exactly Lantern estimated a metric, you can instruct Lighthouse to save the traces resulting from the simulations:

LANTERN_DEBUG=true npx lighthouse --save-assets <url>

Use the Chrome DevTools Performance tab to open the traces. Subscribe to lighthouse#5844 for updates on how Chrome can visualize trace data for you.

If you run pagespeed-score in local mode, it has built-in support for this and also ensures that your Lighthouse setup is as close to PSI as possible:

CHROME_PATH="<path to Chrome Canary or Beta binary>" \
npx pagespeed-score --local --save-assets --lantern-debug "<url>"

Summary

The Google PageSpeed Insights score is a great tool for gaining a high-level overview of web application performance. Lighthouse is developed in the open, and you can look at how it works and contribute. It’s important to be aware of its limitations; we hope our recommendations above will help you make the most of your measurements. Let’s build a faster web together!
