Postgres Randomized Primary Keys

There are times in your life when you might be asked to randomize the primary keys of a table so that they are numeric, unique, but non-sequential. If you been assigned this task, then this post is for you ❤.

Jul 12, 2017 · 4 min read
Image for post
Image for post

Numeric, Unique, but Non-Sequential?

That’s correct. While typical primary keys are just simple sequences of integers (1,2,3,4,5,6,…), this default is problematic because it:

  1. Makes it easy for users to guess random IDs

Enter the Feistel Cipher

Feistel ciphers are a type of block cipher that are symmetric and can be written in such a way that they:

What About My Existing Records?

Most of the information you’ll find online about this topic does not take existing records into consideration. If you already have a bunch of records that were generated using sequential primary keys, you would be at risk of generating a duplicate ID, potentially making it impossible for users to generate new records.

The Postgres Custom Functions

Before we can use our handy random ID generating cipher, we need to register it as a function within Postgres. In this case we are actually going to register two separate functions. One that generates the “random” ID, and the other that ensures we respect the min/max bounds. These functions are written using SQL Procedural Language (PL/pgSQL) which you can read more about here.

CREATE FUNCTION pseudo_encrypt_24(VALUE int) returns int AS $$
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
l1:= (VALUE >> 12) & (4096-1);
r1:= VALUE & (4096-1);
l2 := r1;
r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * (4096-1))::int;
l1 := l2;
r1 := r2;
i := i + 1;
RETURN ((l1 << 12) + r1);
$$ LANGUAGE plpgsql strict immutable;
CREATE FUNCTION bounded_pseudo_encrypt(VALUE int, MAX int, MIN int) returns int AS $$
VALUE := pseudo_encrypt_24(VALUE);
$$ LANGUAGE plpgsql strict immutable;
DROP FUNCTION bounded_pseudo_encrypt(VALUE int, MAX int, MIN int);
DROP FUNCTION pseudo_encrypt_24(VALUE int);

Ensuring They Work

With our functions registered, we can easily test them by generating a sequence and passing it through our shiny new functions.

-- Returns 50 rows
SELECT n, bounded_pseudo_encrypt(n, 50000, 10000) FROM generate_series(1,50) n;
-- Returns a single row
select bounded_pseudo_encrypt(1, 16777215, 2);
Image for post
Image for post
Notice how the bounded_pseudo_encrypt values are respectful of our max (50k) and min (10k) values. Running this query again should result in the exact same results, ensuring we never have any duplicate values.

Updating Your Primary Key Column

The final thing we need to do is alter our table so that the primary key column uses our custom functions to generate IDs. This is actually pretty easy — you just need to run the following query:

-- Assuming you already have 1000 existing records and want the max (16,777,215) number of IDs...ALTER TABLE your_table ALTER COLUMN id SET DEFAULT bounded_pseudo_encrypt(nextval('your_table_your_table_id_seq')::int, 16777215, 1000);

That’s All Folks!


Written by

Situated at @closingfolders as the VP of Engineering. I enjoy most things involved with the internet, up and down the stack.


Written by

Situated at @closingfolders as the VP of Engineering. I enjoy most things involved with the internet, up and down the stack.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch

Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore

Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store