Badger vs BoltDB: Persistent key-value stores written in Go

Introduction

A month ago, my colleague Albert Widiatmoko introduced me to Badger, a key-value store written in Go, and shared an introductory article about it. If you have not read that article yet, I recommend going through it first: it gives a brief history of Badger and explains the concept behind it. After reading it, the idea of storing keys and values separately really caught my attention.

On the other hand, there is another fairly famous key-value store written in Go: BoltDB. At the time of writing, there was no benchmark comparing Badger and BoltDB, so I decided to create one by building a small wrapper service around each store, exposing two gRPC endpoints: one for reads and one for writes.

The purpose of this experiment is to benchmark Badger and BoltDB as persistent key-value stores, using values large enough that the dataset cannot fit entirely in memory.

Datasets

I created 2 datasets for this benchmark:

  1. 10,000 keys with 64 kB (small) and 2,932 kB (large) values, equally distributed. Equally distributed means that small and large values are inserted alternately.
  2. 50,000 keys with 64 kB (small) and 2,932 kB (large) values, equally distributed.
PS: Badger took less than 8 minutes to populate the second dataset on my laptop, while BoltDB took more than 5 hours.
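To make the "equally distributed" scheme concrete, here is a minimal sketch of how such a dataset can be generated. The function and key format are illustrative, not taken from the original benchmark code.

```go
package main

import (
	"bytes"
	"fmt"
)

const (
	smallSize = 64 * 1024   // 64 kB
	largeSize = 2932 * 1024 // 2,932 kB
)

// makeDataset returns n key/value pairs whose values alternate between
// the small and large sizes, i.e. "equally distributed".
func makeDataset(n int) map[string][]byte {
	data := make(map[string][]byte, n)
	for i := 0; i < n; i++ {
		size := smallSize
		if i%2 == 1 {
			size = largeSize
		}
		key := fmt.Sprintf("key-%08d", i)
		data[key] = bytes.Repeat([]byte{'x'}, size)
	}
	return data
}

func main() {
	data := makeDataset(4)
	fmt.Println(len(data), len(data["key-00000000"]), len(data["key-00000001"]))
}
```

For the first dataset, n would be 10,000 (5,000 small and 5,000 large values); for the second, 50,000.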

Experimental Setup

My first approach was to run this benchmark on my laptop, whose specifications are listed below.

4-core processor
8 GB RAM
SSD with 17.5k IOPS (measured using fio)

There are 4 kinds of benchmark:
 1. BenchmarkGetSmall — Read a specific key with a small (64 kB) value.
 2. BenchmarkGetLarge — Read a specific key with a large (2,932 kB) value.
 3. BenchmarkSetSmall — Write a random key with a small value.
 4. BenchmarkSetLarge — Write a random key with a large value.

I ran the benchmark against 3 targets: Badger with default options, Badger with LoadToRAM, and BoltDB with default options. The benchmark is run using “go test -bench .”.
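The four benchmarks above can be sketched as standard Go benchmark functions. This is an illustrative, self-contained version: the `Store` interface and the in-memory `mapStore` stand in for the real gRPC wrapper around Badger or BoltDB, and only the small-value pair is shown (the large-value variants are identical except for a 2,932 kB value).

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
	"testing"
)

// Store abstracts the wrapper service's two endpoints. In the real
// benchmark it would be backed by the Badger or BoltDB gRPC service;
// the names here are illustrative.
type Store interface {
	Get(key []byte) ([]byte, error)
	Set(key, value []byte) error
}

// mapStore is a trivial in-memory Store that keeps this sketch runnable.
type mapStore struct{ m map[string][]byte }

func (s *mapStore) Get(key []byte) ([]byte, error) { return s.m[string(key)], nil }
func (s *mapStore) Set(key, value []byte) error    { s.m[string(key)] = value; return nil }

var store Store = &mapStore{m: map[string][]byte{
	"small-key": bytes.Repeat([]byte{'x'}, 64*1024),
}}

// BenchmarkGetSmall reads one specific key holding a 64 kB value.
func BenchmarkGetSmall(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if _, err := store.Get([]byte("small-key")); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkSetSmall writes random keys with a 64 kB value. A fixed pool
// of pre-generated random keys bounds memory use in this sketch.
func BenchmarkSetSmall(b *testing.B) {
	value := bytes.Repeat([]byte{'x'}, 64*1024)
	keys := make([][]byte, 1024)
	for i := range keys {
		keys[i] = []byte(fmt.Sprintf("key-%d", rand.Int63()))
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := store.Set(keys[i%len(keys)], value); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// testing.Benchmark lets us run the benchmarks outside "go test".
	fmt.Println("get:", testing.Benchmark(BenchmarkGetSmall))
	fmt.Println("set:", testing.Benchmark(BenchmarkSetSmall))
}
```

In the actual experiment, each benchmark was run with `go test -bench .` against the three targets.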

PS: Badger has a “MapTablesTo” option that can improve its performance. Simply set it to “table.LoadToRAM” for better performance.
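Setting that option looks roughly like the following. This is a sketch based on Badger's API at the time of writing (the pre-1.0 `badger.NewKV` API); field and package names may differ in later releases, and the directory path is just an example.

```go
package main

import (
	"log"

	"github.com/dgraph-io/badger"
	"github.com/dgraph-io/badger/table"
)

func main() {
	opt := badger.DefaultOptions
	opt.Dir = "/tmp/badger"
	opt.ValueDir = "/tmp/badger"
	// Load LSM tables into RAM instead of memory-mapping them.
	opt.MapTablesTo = table.LoadToRAM

	kv, err := badger.NewKV(&opt)
	if err != nil {
		log.Fatal(err)
	}
	defer kv.Close()
}
```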

Experimental Results

Benchmark result with the first dataset (10,000 keys).

As shown in the results, BoltDB has the best read performance, but in terms of writes, Badger wins big over BoltDB. The results also show that the “LoadToRAM” option gives Badger an extra performance boost.

Benchmark result with the second dataset (50,000 keys).

As shown in the results, BoltDB still has the best read performance, and its reads stay fairly consistent as more keys and values are stored. The same holds for Badger with LoadToRAM, which moreover stays consistent in terms of writes. BoltDB, by contrast, suffers degraded write performance on the larger dataset.

Benchmarking on my laptop alone is not enough, so I also created a DigitalOcean instance with the machine specifications listed below.

8-core processor
16 GB RAM
SSD with 23.5k IOPS

Benchmark result with the first dataset (10,000 keys).
Benchmark result with the second dataset (50,000 keys).

Now, with a faster SSD, the results show that Badger beats BoltDB on small-value reads. For write operations, Badger still wins over BoltDB by a large margin.

Conclusion

BoltDB is pretty fast at read operations, but Badger is not far behind. On a faster SSD, Badger performs really well and makes the read-performance difference with BoltDB negligible. For write operations, Badger wins decisively over BoltDB. Badger is still pretty young and has already shown good performance. I am looking forward to Badger's future, and thanks to the Dgraph team, the creators of Badger.