Database Pages — A deep dive
Databases often use fixed-size pages to store data. Tables, collections, rows, columns, indexes, sequences, documents and more eventually end up as bytes in a page. This way the storage engine can be separated from the database frontend responsible for data format and API. Moreover, this makes it easier to read, write or cache data when everything is a page.
Here is an example of SQL Server page layout.
In this article I explore the idea of a database page, how it is read and written to disk, how they are stored on disk, and finally I go through an example of page layout in Postgres.
A Pool of Pages
Databases read and write in pages. When you read a row from a table, the database finds the page where the row lives and identifies the file and offset where the page is located on disk. The database then asks the OS to read from the file on the particular offset for the length of the page. The OS checks its filesystem cache and if the required data isn’t there, the OS issues the read and pulls the page in memory for the database to consume.
The database allocates a pool of memory, often called shared or buffer pool. Pages read from disk are placed in the buffer pool. Once a page is in the buffer pool, not only we get access to the requested row but also other rows in the page too…