[Technical] Announcing psqueues
At Better, the engineering team is dedicated to building the best learning experience. This means our app should be fast, even when there are thousands of users accessing our system at the same time.
Caching plays an important role here. While there are pre-made solutions available, such as memcached, custom in-memory caches can provide many benefits (e.g., avoiding repeated deserialization) if they are implemented and used correctly.
To simplify building such caches, we started the psqueues project. Simon Meier wrote an initial version just before ZuriHac 2014. At the Hackathon, we received some very valuable contributions from Alex Sayers, Florian Hartwig, Julien Truffaut, Jussi Mäki, Rooslan S. Khayrov, Stanisław Findeisen and Utkarsh Upadhyay. After this group effort, Jasper Van der Jeugt took some time to finish everything up, and we can now present a complete package with a sensible API, meaningful benchmarks, and close to 100% test coverage.
The psqueues package provides Priority Search Queues in three different flavors.
- `OrdPSQ k p v`, which uses the `Ord k` instance to provide fast insertion, deletion and lookup. This implementation is based on Ralf Hinze’s “A Simple Implementation Technique for Priority Search Queues”. Hence, it is similar to the PSQueue library, although it is considerably faster and provides a slightly different API.
- `IntPSQ p v` is a far more efficient implementation. It fixes the key type to `Int` and uses a radix tree (like `IntMap`) with an additional min-heap property.
- `HashPSQ k p v` is a fairly straightforward extension of `IntPSQ` — it simply uses the keys’ hashes as indices in the `IntPSQ`. If there are any hash collisions, it uses an `OrdPSQ` to resolve those. The performance of this implementation is comparable to that of `IntPSQ`, but it is more widely applicable since the keys are not restricted to `Int`, but rather to any `Hashable` datatype.
Each of the three implementations provides the same API, so they can be used interchangeably. The benchmarks show how they perform relative to one another, and also compared to the other Priority Search Queue implementations on Hackage: PSQueue and fingertree-psqueue.
Building a datastructure such as an LRU Cache on top of a priority search
queue is very straightforward — as you can see in this implementation. Even though this is only a simple implementation of an LRU Cache, it already
performs well compared to the other implementations in that repository. Another important use case is the GHC IO Manager: we believe that it would profit by switching to `IntPSQ`.
We hope you will find this package useful and we look forward to releasing more projects to the Haskell community in the future.
The engineering team at Better