Applied C++: Memory Latency

Benchmarking Kaby Lake and Haswell memory latency using lists

Andriy Berestovskyy
Published in Applied
11 min read · Apr 9, 2019

Modern CPUs are complex beasts with billions of transistors. This hardware complexity brings indeterminacy even to simple software algorithms.

Let’s benchmark a simple list traversal. The average node access latency corresponds to the CPU cache latencies. Or does it? Let’s put it to the test!

“Do I Know This Already?” Quiz

We benchmark the access latency of lists with different numbers of nodes. All the lists are contiguous in memory, traversed sequentially, and have 4 KB of padding between the next pointers.

If we plot the working set of each benchmark iteration on the x-axis and the node access latency on the y-axis, how will the graph look on a modern CPU?

  1. The latency graph will have a few steps matching the CPU cache specs.
  2. The graph will be steeper, and the latency much higher than the specs.
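
To make the quiz setup concrete, here is a minimal sketch of such a list and a timed traversal. It is not the article's benchmark code: the names `Node`, `make_list`, `traverse`, and `ns_per_node` are hypothetical, and the layout assumes that "4 KB padding" means consecutive next pointers sit exactly 4096 bytes apart on a 64-bit machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical node layout (an assumption, not the article's code):
// the next pointer plus padding, so next pointers are 4 KB apart.
struct Node {
    Node* next;
    char padding[4096 - sizeof(Node*)];
};
static_assert(sizeof(Node) == 4096, "node stride is assumed to be 4 KB");

// Builds a list whose nodes are contiguous in memory and linked in order.
std::vector<Node> make_list(std::size_t count) {
    std::vector<Node> nodes(count);  // value-initialized: next == nullptr
    for (std::size_t i = 0; i + 1 < count; ++i) {
        nodes[i].next = &nodes[i + 1];
    }
    return nodes;
}

// Walks the list sequentially and returns the number of nodes visited.
// Each step depends on the previous load, so this is a pointer chase.
std::size_t traverse(const Node* head) {
    std::size_t visited = 0;
    for (const Node* n = head; n != nullptr; n = n->next) {
        ++visited;
    }
    return visited;
}

// Returns the average time per node access in nanoseconds.
double ns_per_node(const std::vector<Node>& nodes, int repeats) {
    assert(repeats > 0);
    // The volatile read keeps the compiler from hoisting the traversal
    // out of the repeat loop.
    const Node* volatile head = nodes.data();
    std::size_t visited = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        visited += traverse(head);
    }
    const auto stop = std::chrono::steady_clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return visited != 0 ? ns / static_cast<double>(visited) : 0.0;
}
```

Calling `ns_per_node(make_list(n), repeats)` for growing `n` gives one point per working set of `n * sizeof(Node)` bytes. The timings are only indicative: a careful measurement would also pin the thread, warm up the caches, and use a benchmarking framework.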

