What makes MySQL LRU cache scan resistant

Disk reads are 4x (for SSD) to 80x (for magnetic disk) slower as compared to main memory (RAM) reads and hence it becomes extremely important for a database to utilize main memory as much as it can, and be super-performant while keeping its latencies to a bare minimum. Engines cannot simply replace disks with RAM because of volatility and cost, hence it needs to strike a balance between the two — maximize main-memory utilization and minimize the disk access.

The database engine virtually splits the data files into pages. A page is a unit which represents how much data the engine transfers at any one time between the disk (the data files) and the main memory. It is usually a few kilobytes 4KB, 8KB, 16KB, 32KB, etc. and is configurable via engine parameters. Because of its bulky size, a page can hold one or multiple rows of a table depending on how much data is in each row i.e. the length of the row.

Locality of reference

Spatial Locality of Reference

Having a larger page size addresses this situation to some extent. As one page could fit multiple rows, this means when that page is cached in main memory, the engine saves a disk read if the neighbouring rows residing on the same page are accessed.

Another way to address this situation is to read-ahead pages that are very likely to be accessed in the future and keep them available in the main memory. This way if the read-ahead pages are referenced, the engine needs to go to the disk to fetch the page, rather it will find the page residing in the main memory and thus saving a bunch of disk reads.

Temporal Locality of Reference

Caching exploits this behaviour by putting every single page accessed from the disk into main-memory (cache). Hence the next time a page which is available in the cache is referenced, the engine need not make a disk read to get the page, rather it could reference it from the cache directly, again saving a disk read.

Since the cache is very costly, it is in magnitude smaller in capacity than the disk. It can only hold some fixed number of pages which means the cache suffers from the problem of getting full very quickly. Once the cache gets full, the engine needs to evict an old page so that the new page, which according to the temporal locality of reference is going to be accessed in the near future, could get a place in the cache.

The most common strategy that decides the page that will be evicted from the cache is the Least Recently Used cache eviction strategy. This strategy uses Temporal Locality of Reference to the core and hence evicts the page which was not accessed the longest, thus maximizing the time the most-recently accessed pages are held in the cache.

LRU Cache

The one end (head) of the list holds the most-recently referenced page while the fag end (tail) of the list holds the least-recently referenced one. A new page, being most-recently accessed, is always added at the head of the list while the eviction happens at the tail. If a page from the cache is referenced again, it is moved to the head of the list as it is now the most-recently referenced.

Implementation

InnoDB’s Buffer Pool

A notorious problem with Sequential Scans

What would happen if an entire table is scanned? say, while talking a DB dump, or running a SELECT without WHERE to perform some statistical computations.

Going by the MySQL’s aforementioned behaviour, the engine iterates on all the pages and since each page which is accessed now is the most recent one, it puts it at the head of the cache while evicting one from the tail.

If the table is bigger than the cache, this process will wipe out the entire cache and fill it with the pages from just one table. If these pages are not referenced again, this is a total loss and performance of the database takes a hit. The performance will pickup once these pages are evicted from the cache and other pages make an entry.

Midpoint Insertion Strategy

Instead of treating the Buffer Pool as a single doubly-linked list, it treats it as a combination of two smaller sublists — usually 5/8th and 3/8th of the total size. One sublist holds the younger data while the other one holds the older data. The head of the Young sublist holds the most recent pages and the recency decreases as we reach the tail of the Old sublist.

Eviction

Insertion

By inserting in the middle, the pages that are only read once, such as during a full table scan, can be aged out of the Buffer Pool sooner than with a strict LRU algorithm.

Moving page from Old to the Young sublist

If the page is referenced the second time it is moved to the head of Young sublist and hence stays in the cache for a longer time. If the page, after being inserted in the middle, is never referenced again (during full scans), it is evicted sooner because the Old sublist is usually shorter than the Young sublist.

The Young sublist thus remains unaffected by table scans bringing in new blocks that might or might not be accessed afterwards. The engine thus remains performant as more frequently accessed pages continue to remain in the cache (Young sublist).

MySQL parameter to tune the midpoint

In order to get greater insights about Buffer Pool we can invoke the following command as

The command SHOW ENGINE INNODB STATUS outputs a lot of interesting metrics but the most interesting and critical ones, w.r.t Memory and Buffer Pool, are

  • number of pages that were made young
  • rate of eviction without access
  • cache hit ratio
  • read ahead rate

Conclusion

This article was originally published on my blog — What makes MySQL LRU cache scan resistant

Other articles you might like

If you liked what you read, subscribe to my newsletter and get the post delivered directly to your inbox and give me a shout-out @arpit_bhayani.

Principal Engineer @Unacademy • Data @amazon • Platform @practo | Writes about Language internals, System Design, Programming and Math in Computer Science.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store