Simple Bloom Filter in C#
Question
What would a simple Bloom Filter look like in C#?
Answer
Details
Bloom filters can tell you if you definitely haven’t seen something before. They can also tell you “maybe you’ve seen this before, but feel free to check for sure.”
This can come in handy if you don’t want to cache every response to a request. Instead, you can ask the Bloom filter if you’ve seen this request before, and if you have, cache this response this time. This data might be popular for a while.
Obviously, there’s no framework in today’s example. Most examples you see of a Bloom Filter involve multiple hashes (a good idea), some level of abstraction (also a good idea), and some opinions about how you’d incorporate it into your overall project.
The problem is that these examples can be a little opaque and forget to explain what the filter is doing. The above lines are an x-ray-vision walkthrough written in C#. Feel free to bring your own extras.
Nerdier details
I originally set the filter size to 20 when writing this post. But, during a review, a friend of mine reminded me that the size of the filter array isn’t arbitrary. Specifically, since we’re modding by the size of a byte (8), we need to use an array size that is coprime with 8 (where the gretest common divsior is 1). This ensures that all the bits can be used and reduces your chance of collisions.
Originally published at blog.matthewd.co on July 21, 2017.
