Stop using UUID v4 as a primary key
With the rise of distributed systems, UUIDs became really popular because they allow unique identifiers to be generated in a decentralised manner. RFC 4122, the original UUID specification, describes 5 versions:
- v1 combines a timestamp with the computer’s MAC address to generate the label
- v2 uses more “static” information and has limited space for randomness: only 64 unique UUIDs can be generated within ~7 minutes
- v3 and v5 are deterministic and based on the supplied input
- v4 has only 6 static bits, for the version (4 bits) and the variant (2 bits); the remaining 122 bits are random
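The bit counts in the list above can be checked with Python's standard `uuid` module. This is only a small sketch: the name `"example.com"` is an arbitrary illustration, and the v2 arithmetic assumes the DCE Security layout, in which the low 32 bits of the 100 ns timestamp are replaced and the clock sequence is 6 bits wide.

```python
import uuid

# v4: 122 random bits plus a fixed 4-bit version and 2-bit variant.
u4 = uuid.uuid4()
version_bits = (u4.int >> 76) & 0xF    # 4-bit version field
variant_bits = (u4.int >> 62) & 0b11   # 2-bit variant field (binary 10)

# v5 is deterministic: the same namespace and name give the same UUID.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

# v2 (assumed layout): the timestamp effectively ticks once every
# 2**32 * 100 ns, and a 6-bit clock sequence distinguishes IDs within a tick.
V2_TICK_SECONDS = 2**32 * 100e-9   # roughly 429.5 s, a little over 7 minutes
V2_IDS_PER_TICK = 2**6             # 64
```

Everything outside the version and variant fields of `u4` changes on every call, which is what makes v4 so convenient to generate without any coordination.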
So v4 became the most popular version. However, one of the main downsides of using UUID v4 as a primary key comes from the way database indexes are implemented. Let’s see the cause of the issue and what we can do to address it.
Problem statement
Most databases use B-trees for indexes. A B-tree is a self-balancing tree data structure that supports operations such as insert, search and delete in logarithmic time.
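To get a feel for what "logarithmic" means here, the sketch below estimates how many levels a tree needs for a given number of keys. It is a simplified model, not how any particular database sizes its index: `btree_levels` is a hypothetical helper, and it assumes every node holds exactly `fanout` entries.

```python
def btree_levels(num_keys: int, fanout: int) -> int:
    """Smallest number of levels whose capacity (fanout ** levels) covers num_keys."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# With an assumed fanout of 100 entries per node:
levels_million = btree_levels(1_000_000, 100)      # 3
levels_billion = btree_levels(1_000_000_000, 100)  # 5
```

Under these assumptions, a thousand times more keys adds only two levels, so the number of nodes visited per operation stays small. What matters in practice is whether those nodes are already in memory.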
Indexes are often too large to fit entirely in memory, so only a subset of the index nodes is cached at any given time.
When monotonically increasing values are inserted into the index, they go to the right-most nodes of the tree. Most likely those nodes are cached…
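The caching effect can be illustrated with a toy simulation. This is a crude sketch rather than a real B-tree: a sorted list cut into fixed-size "pages" stands in for the leaf level, a small LRU cache stands in for memory, and `cache_hit_rate`, `page_size` and `cached_pages` are hypothetical names and values chosen only for illustration.

```python
import bisect
import uuid
from collections import OrderedDict


def cache_hit_rate(keys, page_size=100, cached_pages=10):
    """Insert keys into a sorted list split into fixed-size pages and return
    the fraction of inserts that land on a page already held in an LRU cache."""
    index = []             # sorted keys, standing in for the leaf level
    cache = OrderedDict()  # page number -> None, ordered by recency of use
    hits = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        page = pos // page_size
        if page in cache:
            hits += 1
            cache.move_to_end(page)
        else:
            cache[page] = None
            if len(cache) > cached_pages:
                cache.popitem(last=False)  # evict the least recently used page
        index.insert(pos, key)
    return hits / len(keys)


n = 10_000
monotonic_rate = cache_hit_rate(list(range(n)))
random_rate = cache_hit_rate([uuid.uuid4().int for _ in range(n)])
```

In this model the monotonic keys miss only when a new right-most page is started, so almost every insert hits the cache, while the v4 keys are spread across the whole key space and hit far less often.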