One Way to Store Money in PostgreSQL Database and A Benchmark against Its Alternative

Consider using integers+magnifier if you want to save some 60% of query time!

Jason Lui

Published in

Xendit Engineering

4 min readMar 1, 2022

Synopsis

To store money in PostgreSQL, use integers with magnifier if you know the finest denomination, use numeric if you don’t, or if you can tolerate its performance;
PostgreSQL aggregation on numeric type takes around 60% more time than integer types.

Money is at the heart of everything we do here at Xendit. Being a scrupulous and customer-centric organization, we want our storage solution for those fractional values to be lossless and performant. Hence, we spend considerable effort on our PostgreSQL database schema design.

Numeric? Integer?

Let’s suppose we need to store transactions with their monetary values and currencies. Since we can’t tolerate any rounding issue, the first data type that comes into mind is naturally numeric, which makes the table look something like this:

A transaction with US$13.67 will be stored as one transaction record, with amount being “13.67” and currency being “USD”.

Problem solved? We don’t think so, because the official document says:

(for numeric) The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

Looks like it can take up more space than fixed-size integers for numbers with more digits. We may get hit by this since we deal with IDR! (FYI, at the time of writing, 1 USD is around 14,300 IDR)

However, calculations on numeric values are very slow compared to the integer types, or to the floating-point types described in the next section.

Does not sound appealing!

The other solution is integer (or big integer) with magnifiers, and the tables would resemble this.

The same transaction with US$13.67 will be stored as one transaction record with magnified_amount being “1367” and currency being “USD”, along with a record in magnifiers with currency being “USD” and magnifier being “100”, as illustrated below.

In simpler terms, we are storing this as 1367 cents rather than 13.67 dollars.

Apparently, this isn’t perfect either, the range is bounded (although it’s usually big enough), and more importantly, the finest denomination is sort of locked by the magnifier, e.g., to make the above table support mills, we’d have to carry out a migration. Plus, developers need to always remember the value is magnified (which is why we name the field magnified_amount as a reminder).

But it is faster.

Well, how much faster?

That is the question we need to answer to decide whether it’s worth trading the straightforward and shackle-free numeric implementation. Off to some benchmarking we go!

Setup

Hardware:

MacBook Pro 2019 16-inch
Processor: 2.4 GHz 8-Core Intel Core i9
Memory: 16 GB 2667 MHz DDR4

OS: macOS Monterey version 12.1

PostgreSQL version: 13 (postgres:13-alpine docker image, Docker Desktop 4.4.2 (73305) for Mac)

Seeding script:

CREATE TABLE IF NOT EXISTS numbers
(
    id             BIGSERIAL constraint numbers_pk primary key,
    integer        INT            NOT NULL,
    biginteger     BIGINT         NOT NULL GENERATED ALWAYS AS ( integer ) STORED,
    numeric_free   NUMERIC        NOT NULL GENERATED ALWAYS AS ( integer ) STORED,
    numeric_coarse NUMERIC(19)    NOT NULL GENERATED ALWAYS AS ( integer ) STORED,
    numeric_fine   NUMERIC(19, 5) NOT NULL GENERATED ALWAYS AS ( integer ) STORED
);INSERT INTO numbers (integer)
SELECT RANDOM() * 10000 :: int
FROM generate_series(1, 10000000);

This creates a table numbers and inserts 10 million records of a random number stored in five different data types we want to compare.

Tests & Execution

I used two sets of test queries to avoid bias from a single aggregate:

SELECT SUM(integer) FROM numbers;
SELECT SUM(biginteger) FROM numbers;
SELECT SUM(numeric_free) FROM numbers;
SELECT SUM(numeric_coarse) FROM numbers;
SELECT SUM(numeric_fine) FROM numbers;SELECT AVG(integer) FROM numbers;
SELECT AVG(biginteger) FROM numbers;
SELECT AVG(numeric_free) FROM numbers;
SELECT AVG(numeric_coarse) FROM numbers;
SELECT AVG(numeric_fine) FROM numbers;

To avoid possible caching, I followed different execution orders for the two aggregates. The five “SUM” queries were run in order of INTEGER > BIGINTEGER > NUMERIC > NUMERIC(19) > NUMERIC(19,5) and repeated 10 times, while the “AVG” query was run for INTEGER 10 times, followed by for BIG INTEGER 10 times, then for NUMERIC 10 times, before for NUMERIC(19) 10 times, finally for NUMERIC(19,5) 10 times.

Results

The result was as follows (average query execution time in milliseconds).

From this, we learned that numeric types, regardless of precision and scale, were significantly (about 50% to 70%) slower than integer implementations.

Based on this, we adopted the integer+magnifier approach.

Hope it helps!