How we brought down storage costs for the Ads platform: Part 1
At ShareChat, our Ads platform lets advertisers run all their marketing campaigns on our app, at scale. A recurring problem we have faced is per-user frequency capping for an ad.
To enforce a cap, our code needs to compare, in real time, the number of times a user has already seen an ad against the configured frequency, which means we need to store how many times each user has seen each ad. On the face of it, the problem looks simple enough, but when you are solving it at the scale of 160 million MAU, it gets trickier.
We could always throw more servers and storage at any solution to scale it, but innovation happens when you try to be frugal.
In this two-part series, we will walk through the various solutions we explored, the mistakes we made, and the final solution we implemented:
- Part 1: The different solutions we considered for this problem, along with the pros and cons of each.
- Part 2: The final solution and its implementation.
Specifications our end solution needs to satisfy
- Low latency is of prime importance because we need very fast reads.
- The system should support a minimum of 500k ops per second.
- The system should handle a minimum of 10k active ads at any given time.
- Close to 100% accuracy is required, but complete 100% accuracy is not essential.
- Storage costs should be as low as possible.
Solution 1: Maintain user x ad view counts in a DB and fetch the counts in real time.
Pros:
- Good enough for a small use case when you are just starting up and already have DB drivers in place.
Cons:
- Cannot scale due to the higher latencies involved with DB operations.
- Storing a huge amount of data involves huge storage costs too.
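As a rough sketch of what Solution 1 could look like, assuming a relational table keyed by (user_id, ad_id) and PostgreSQL-style upsert syntax; the table and column names here are illustrative, not our actual schema:

```python
# Illustrative schema (not our actual one):
#   CREATE TABLE ad_views (
#       user_id BIGINT,
#       ad_id   BIGINT,
#       views   INT NOT NULL DEFAULT 0,
#       PRIMARY KEY (user_id, ad_id)
#   );

def can_serve(conn, user_id: int, ad_id: int, frequency_cap: int) -> bool:
    """Compare the stored view count against the configured cap in real time."""
    cur = conn.cursor()
    cur.execute(
        "SELECT views FROM ad_views WHERE user_id = %s AND ad_id = %s",
        (user_id, ad_id),
    )
    row = cur.fetchone()
    views = row[0] if row else 0
    return views < frequency_cap


def record_view(conn, user_id: int, ad_id: int) -> None:
    """Upsert the counter after the ad impression is served."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO ad_views (user_id, ad_id, views) VALUES (%s, %s, 1) "
        "ON CONFLICT (user_id, ad_id) DO UPDATE SET views = ad_views.views + 1",
        (user_id, ad_id),
    )
    conn.commit()
```

Every impression request pays a round trip to the database on the read path, which is exactly where the latency problem comes from.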
Solution 2: Move the counts from Solution 1 into a cache like Redis, using a simple key-value data structure.
Pros:
- Improves on the latency of Solution 1, since the data now lives in a cache.
- Still good enough only for small use cases.
Cons:
- Storing a huge amount of data still involves huge storage costs.
- A single key takes about 70 bytes, so even a single ad seen by 10 million users uses up roughly 700 MB of storage, and we have to support more than 10k active ads.
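A minimal sketch of Solution 2 using redis-py, with a hypothetical freq:{user_id}:{ad_id} key scheme (not our production naming):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def can_serve(user_id: int, ad_id: int, frequency_cap: int) -> bool:
    """One plain Redis key per (user, ad) pair holding an integer counter."""
    views = r.get(f"freq:{user_id}:{ad_id}")
    return int(views or 0) < frequency_cap

def record_view(user_id: int, ad_id: int) -> None:
    """INCR creates the key with value 1 if it does not exist yet."""
    r.incr(f"freq:{user_id}:{ad_id}")
```

The storage problem is structural: every (user, ad) pair pays for a full key name plus Redis' per-key overhead, which is roughly where the 70 bytes per entry figure comes from.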
Solution 3: Improve on Solution 2 by using Redis hashmaps to store per-ad views instead of simple key-values.
Pros:
- Improves on the storage usage of Solution 2, since Redis hashmaps use less storage than plain key-values: https://docs.redislabs.com/latest/ri/memory-optimizations/#convert-hashtable-to-ziplist-for-hashes
- However, Redis hashmaps come with a constraint: once a hash grows past a configured number of fields, it is stored internally as a regular hash table and loses the compact encoding. As a result, the storage improvement only holds at a smaller scale, since the number of active ads is dynamic and keeps increasing and decreasing.
Cons:
- There is a possibility of huge storage costs in this solution too.
- Doesn’t actually improve much on Solution 2.
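A minimal sketch of Solution 3, again using redis-py with illustrative key names: one Redis hash per ad, with user ids as fields. Redis only keeps the compact ziplist/listpack encoding while a hash stays under the configured hash-max-ziplist-entries / hash-max-listpack-entries limit (128 fields by default), which is the constraint called out above:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def can_serve(user_id: int, ad_id: int, frequency_cap: int) -> bool:
    """Each ad has one hash; per-user counters live in its fields."""
    views = r.hget(f"ad_views:{ad_id}", str(user_id))
    return int(views or 0) < frequency_cap

def record_view(user_id: int, ad_id: int) -> None:
    """HINCRBY creates the hash and the field on first use."""
    r.hincrby(f"ad_views:{ad_id}", str(user_id), 1)
```

A popular ad can easily accumulate millions of viewer fields, so in practice these hashes fall back to the regular hashtable encoding and the savings disappear.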
Solution 4: Store the hashmap on the client side instead of on the server side.
Pros:
- This solution satisfies all the requirements of the spec because the client passes the hashmap to the backend in each request, which gives us low latency along with low storage costs and 100% accuracy.
Cons:
- Results in heavy client-side operations, along with having to maintain the data in the app.
- The entire payload has to be passed to the backend on every ads request, and it can grow large depending on the number of ads a user sees in a day. The payload might also need to be passed through multiple microservices on the backend (a rough example of its shape is shown below).
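For illustration, the per-request payload in Solution 4 might look something like the following; the field names are hypothetical:

```python
# Shape of the data the client would keep locally and attach to every ad
# request; it grows with the number of distinct ads the user has seen.
ad_request_payload = {
    "user_id": "u-12345",
    "seen_counts": {      # ad_id -> times this user has already seen the ad
        "ad-901": 3,
        "ad-777": 1,
        "ad-123": 7,
    },
}
```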
Solution 5: Redis bitmaps for each user with blocks of bits representing each ad’s views.
Pros:
- All the requirements of the spec are satisfied by this solution, with very low storage costs.
Cons:
- Operational overhead of removing deactivated campaigns to reclaim space.
- It works well for the specifications and the problem statement we have in Part 1, where we need frequency capping over the lifetime of an ad for a user. But once we also need frequency capping at the daily and hourly levels, this solution too starts using up a huge amount of storage.
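A minimal sketch of Solution 5 using redis-py and the BITFIELD command. The 4-bit counter width, the key naming, and the ad-to-slot mapping are all assumptions for illustration; in a real system that mapping has to be maintained (and cleaned up) as campaigns are activated and deactivated:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

BITS_PER_AD = 4  # 4-bit counters hold 0-15 views; pick a width covering the largest cap

def can_serve(user_id: int, ad_slot: int, frequency_cap: int) -> bool:
    """Read the small unsigned counter stored at this ad's slot in the user's bitmap."""
    op = r.bitfield(f"fcap:{user_id}")
    (views,) = op.get(f"u{BITS_PER_AD}", ad_slot * BITS_PER_AD).execute()
    return views < frequency_cap

def record_view(user_id: int, ad_slot: int) -> None:
    """Saturating increment so a counter never wraps back to zero."""
    op = r.bitfield(f"fcap:{user_id}")
    op.incrby(f"u{BITS_PER_AD}", ad_slot * BITS_PER_AD, 1, overflow="SAT").execute()
```

With 4 bits per ad and 10k active ad slots, a user's bitmap tops out at around 5 KB no matter how many ads they see, which is why the storage footprint stays small until daily and hourly caps multiply the number of slots needed.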
Finishing up
We just went through the different solutions we could use for the problem at hand. All of these are valid solutions at different levels of scale; for example, Solution 1 is the best choice when the scale is small and you are just starting up.
We'll continue in Part 2, where we explore a modified problem statement in which we want this frequency capping at the overall, daily, and hourly levels for each ad. We will also cover the final implementation we went ahead with.
Any guesses as to what implementation we could have used? I'll give you a hint: we used a probabilistic data structure.