Google Cloud Platform CPU Performance in the eyes of SPEC CPU® 2017 — Part 4

Federico Iezzi
Google Cloud - Community
5 min read · Feb 10, 2023

In this last part of the series, I’d like to share where all my work has been recorded. So here we go:

I’m a big fan of recording as much data as possible and, like any good datahoarder, I collected, well, everything in this sheet (the format is a bit questionable, indeed). The spreadsheet is made of 19 sheets: each topic is color-matched, the names are (supposed to be) self-descriptive, and, hmm, oh right, the color formatting took me an awful amount of time 🤣

Platforms Characteristics

As the name suggests, each machine type is categorized, with great detail on the vNUMA topology (or at least, what’s exposed to the VM), the CPU ID (and for the geeks here, there are some neat details), and clock speeds (base clock, single-core and all-core turbos). You may notice the “Standard Benchmark type”. This is because, originally, I was also working on an XXL comparison taking the largest available machines, but, as hinted in previous posts, it was incredibly expensive to run (a matter of thousands of dollars) and I decided to drop the idea/project.
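By the way, if you’re curious how this kind of data can be pulled from inside a guest, here’s a minimal sketch (not the exact tooling behind the sheet) using the standard Linux interfaces a VM exposes:

```python
import subprocess

def lscpu_fields(*keys: str) -> dict:
    """Return the requested key/value pairs from `lscpu` output."""
    out = subprocess.run(["lscpu"], capture_output=True, text=True, check=True)
    fields = {}
    for line in out.stdout.splitlines():
        key, _, value = line.partition(":")
        if key.strip() in keys:
            fields[key.strip()] = value.strip()
    return fields

# Topology and base clock as the guest sees them; turbo clocks are not
# exposed here (and "CPU max MHz" may be absent on some VM shapes).
print(lscpu_fields("Model name", "CPU(s)", "NUMA node(s)", "CPU max MHz"))
```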

Platform Price tag

Here we have a table collecting the hourly and monthly cost, per machine type, in the GCP region of The Netherlands (europe-west4).

$/performance

This is among the most significant sheets: for each machine type, you can find the monthly cost divided by the sum of the integer and floating-point results, for both Base and Peak. A plotted version is also available.
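To make the arithmetic concrete, here’s a minimal sketch of the metric; the prices and scores are made-up placeholders, not values from the sheet:

```python
# Illustrative only: monthly prices and SPEC scores are placeholders.
machines = {
    # machine type: (monthly USD, int rate Base, fp rate Base)
    "n2-standard-16":  (450.0, 60.0, 70.0),
    "t2d-standard-16": (400.0, 75.0, 80.0),
}

for name, (monthly_usd, int_base, fp_base) in machines.items():
    # Lower is better: dollars spent per unit of aggregate performance.
    print(f"{name}: {monthly_usd / (int_base + fp_base):.2f} $/perf (Base)")
```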

Standard Rank

The geomean of all results (single- and multi-thread, for both Base and Peak), plotted, plus per-core efficiency.
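For reference, the geomean is how SPEC aggregates per-benchmark scores; a minimal sketch with hypothetical numbers:

```python
import math

def geomean(values):
    """Geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-benchmark scores for one machine type.
print(f"{geomean([55.2, 61.8, 48.9, 70.4]):.1f}")
```

The geomean is used because it treats ratios consistently: no single benchmark can dominate the aggregate the way it would with an arithmetic mean.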

16T vs. 16C Rank

Another fundamental sheet, comparing the geomean of 16 cores vs. 16 threads for T2D, T2A, N2D (both Zen 2 and Zen 3), and N2 (both CLX and ICX).

C2D Milan vs. TAU (aka x86 vs. ARM)

As the name suggests, geomean for C2D, T2D (still Milan), and T2A Arm.

Per-core throughput

Another interesting sheet with the per-core throughput for each SPEC CPU 2017 benchmark. The darker the green, the higher the efficiency; the darker the red, well, the lower. Yellow is, effectively, a middle ground.
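The metric itself is simple: an nT rate result divided by the physical cores backing the shape. Here’s a sketch with placeholder numbers; the important bit is counting cores, not vCPUs (on T2D a vCPU is a full core, on N2 it’s an SMT thread):

```python
# Placeholder scores; the point is the cores-vs-vCPUs distinction.
results = {
    # machine type: (nT SPECrate score, physical cores)
    "t2d-standard-16": (80.0, 16),  # on T2D, 1 vCPU = 1 core
    "n2-standard-16":  (60.0, 8),   # on N2, 16 vCPUs = 8 cores + SMT
}

for name, (score, cores) in results.items():
    print(f"{name}: {score / cores:.2f} per-core throughput")
```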

SPEC CPU2017–1T — Base and Peak & SPEC CPU2017 — nT — Base and Peak

Four sheets with the benchmark results for the entire suite: single-thread results for both Base and Peak, and multi-thread results, again for both Base and Peak.

N2/N2D/T2D/T2A — 16T vs. 16C

The full comparison, benchmark by benchmark, of 16 threads vs. 16 cores.

Runtime and Cost

This is a fun one. It’s the sheet where I collected the entire runtime PKB took to compile GCC and run SPEC CPU 2017, and it also includes the cost of the GCE part.

A couple of fun facts:

  • If you want to replicate this experiment, the cost will be around 500 USD (+PD-SSD and the PKB machine; see the sketch after this list);
  • Something I didn’t publish (but worked on in early 2022): the cost of running SPEC on all the largest GCE machines available is over 10 times the current one;
  • The entire runtime is just shy of 24 DAYS (but indeed everything can, and should, run in parallel).
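For anyone budgeting a replication, the estimate boils down to hourly rate times runtime, summed over the machine types; a back-of-the-envelope sketch with placeholder figures (the real ones are in the sheet):

```python
# Placeholder rates and runtimes; the real data lives in the sheet.
runs = [
    # (machine type, hourly on-demand USD, runtime in hours)
    ("n2-standard-16",  0.86, 40.0),
    ("t2d-standard-16", 0.67, 35.0),
]

total_usd = sum(rate * hours for _, rate, hours in runs)
serial_days = sum(hours for _, _, hours in runs) / 24
print(f"~{total_usd:.0f} USD, {serial_days:.1f} days if run serially")
```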

SPEC CPU2017 versions and flags

A simple sheet where GCC flags and versions are recorded.

Mitigations

Spectre? Meltdown?

Here’s how, and why, the Spectre and Meltdown patches will hurt performance

As a person who witnessed, first-hand I should add, the massive performance impact (aka drop) of the early Spectre mitigations, I learned the hard way that, when running a benchmark, it is quite fundamental to record a report of such mitigations. These days I really like the Spectre & Meltdown Checker. It would also be nice (critical, actually 🤣) to have the CPU microcode version at hand, but this is neither exposed by the virtualization layer nor disclosed by Google. So, more generally, this sheet collects a report of the speculative-execution vulnerabilities, how each machine is affected, and the mitigations applied.
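While the sheet is built from the Spectre & Meltdown Checker’s output, the Linux kernel itself exposes a quick per-vulnerability summary under sysfs; a minimal sketch of reading it:

```python
from pathlib import Path

# Each file is one vulnerability; its content states the mitigation
# status, e.g. "Mitigation: ..." or "Not affected".
sysfs = Path("/sys/devices/system/cpu/vulnerabilities")
for vuln in sorted(sysfs.iterdir()):
    print(f"{vuln.name}: {vuln.read_text().strip()}")
```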

PKB CLI

Were you looking for the exact wrapper used for running the simulations? How about the script to prepare the PKB machine? And a script to export the results? Well, there you go.

PKB Patches

As the name suggests, here we have all the patches applied to PKB, as well as the reports made upstream.

SPEC CPU2017 Config

This is the other fundamental piece to replicate my results: you need the SPEC config files for both x86 and Arm.

Standard Raw Results

Last but certainly not least, the raw output from runcpu, which includes lots of details on the benchmarks and platforms. You will certainly notice that all the executions are flagged as “INVALID RUN”: ‘reportable’ flag not set during run. This is due to the lack of personalized config files: there is a gigantic list of system information required when you plan to submit these tests to SPEC for validation. That’s not my aim, so I didn’t spend time here. Perhaps that’s an area of improvement for a future revisit of these results.

Conclusions

All good things must come to an end, and I truly hope you enjoyed reading this work as much as I enjoyed writing it. I will follow up soon with more in-depth studies of SPEC on GCP and, who knows, on other cloud providers shortly. I’m already working on the C3 review/preview and hope to put it out as soon as the product is officially launched. Till then, stay well!
