From 1000 Queries to 5 — How We Scaled Our Commenting System
If you’ve ever stopped by HODINKEE, one thing you might have noticed is that we have our own in-house commenting system. In the past, we’ve used Disqus, which is a great solution for small blogs that don’t want to worry about hosting or building their own commenting infrastructure, but we really wanted to push things to the next level. It may seem like one of those “solved problems,” but we wanted comments to be tightly integrated to the rest of the site, and the best way to do that was to build it ourselves. Back in January of 2017 we launched member profiles and the commenting system simultaneously, and haven’t really looked back.
I’ll be honest. When working on the project, I imagined most articles would get a small handful of comments. Maybe 10 or 20 max. This assumption was based on the number of comments we received via Disqus. Even some of our most popular articles in our Talking Watches series only had around 30 comments. In no way did I expect to be receiving nearly 300 comments on a single article.
While almost 300 comments is a bit of an outlier (the average number of comments per article on HODINKEE is 32, by the way), the fact that it was taking multiple seconds to load all of the comments irked me. Multiple seconds in internet time is basically an eternity. I’ve dealt with scaling issues for all of my professional career, and I’m convinced that no one has 100% conquered them. Scaling issues happen both when and where you least expect. Unfortunately, more important projects kept popping up, so fixing the problem got put on the back burner for a couple of months.
Why Loading Comments Sucks
Rolling your own comment system at first glance is absolutely one of those “how hard could it be” problems (alternatively, “hold my 🍺 and watch this”). It’s a simple tree of data. Each tree node is a comment, and all of its children are its replies. You can cache each tree node, and when a node updates, invalidate the cache for itself and all of its ancestors. Boom, Russian doll caching. Done.
“But what about limiting the max depth of a comment?”
Well, then we can do a little extra processing when building the cache tree, but in the end, each node is extremely cacheable. It’s not like comments are moving around in this tree. Once they’re created, they stay in the same spot.
“But what about giving users the ability to like a comment?”
Okay… this is going to make caching a bit more complicated. The UI needs to be able to indicate when you have voted on a comment, which means that now every node in this comment tree is unique to you. Now we have to cache each node not only on the comment, but also on your user. This means that every user that visits the site gets their own cache key for each comment. It sucks, but at least only the first load will be super slow, and all logged out users can share the same cache.
“You should be able to view your own pending comments though.”
Yeesh, well we can still use the per-user cache for that one. That’s a little extra processing on the backend, though.
“If a comment is deleted, it should no longer be shown unless it has replies. In that case, the comment content should be hidden.”
And that’s not even all of the things we need to consider when building out a comment response for an article. The fact that it is highly customized to each user presents an interesting and non-trivial problem in scaling. Not only do we have to load each comment from the database, but every comment has a user (separate database table), and every user has a set of blocked users (separate database table), and every comment has a number of likes (separate database table). If you’re a developer you might be thinking that sounds like an N+1 nightmare, and you’d be right. If you’re not a developer, that basically means that for every comment we load from our database, there is more data we must load separately for each comment. By that logic, loading 300 comments can quickly turn into 1000 database queries.
How Do You Even Tackle Something Like This?
All of our hopes and dreams of quick and easy cacheability have basically been thrown out of the proverbial window. How do you cache something that is different on a per-user basis? We’re definitely not within the scope of convention for the way our object serializers expect things to work. We need to solve not only the nasty N+1 query problem, but also the per-user cache problem.
After thinking about it for awhile, an interesting solution dawned on me. What if we went ahead and cached the entire comment tree with total disregard to the current user, and then we “decorate” it with per-user tweaks post-cache fetch? This effectively splits loading comments into two separate phases: the fetch phase and the decorate phase. The result of the fetch phase is 100% cacheable. The decorate phase is not, but can be highly optimized.
Fetch Phase
Remember that N+1 query problem I mentioned? We tackle a huge part of that here. From an article, we can easily get all of its comments in a single query, but unfortunately we lose the tree structure we need. We use the ancestry gem (which is amazing, by the way!) to handle our comment organization. One thing that’s important to know before going forward is that we store the comment tree in the database without taking into consideration the max nesting depth. This lets us tweak that max depth at any time, and preserves a much simpler and pure data structure in the database.
The ancestry gem comes with a few built-in methods that allow you to fetch an entire tree in a single database query and arrange it after the fact, which is the first approach I tried. It basically looked like this:
While this solves the N+1 query issue for fetching comments and users (we call them members), it still leaves a lot of boxes unchecked. The biggest box, in my opinion, is that it does not take into account the maximum nesting depth (which is 2, by the way) before we convert a comment to a “reply to” style comment. This moves the comment up to the maximum nesting depth in the comment tree, and identifies the user that we’re replying to above the comment. We could handle that after the fact, but that gets messy pretty quickly.
Instead of relying on any pre-built methods from ancestry, I opted to build the comment tree myself. We can fetch all of the comments and all of the members in one go (which generates 2 total database queries) and work from there. First up, the data structure:
Now that we have a data structure to work with, we can begin processing each comment. We iterate over each comment and build all of the parent/child relationships, taking into account the max nesting depth.
After all that work, we now know how each comment is related to each other, but we still have them in a flat data structure. The next step is converting this to a new tree data structure.
Awesome, now we have a tree of comments that is formatted properly and only needed 2 database queries to load. Now that we have this tree structure, we can pass things over to our serializers. The serializers are responsible for converting the comment objects to the JSON structure we serve from our API. The serializer code is pretty trivial, so I won’t bother including it here. One thing we do, however, is cache the entire comment tree, which is awesome. We have to manually invalidate the full tree cache at times, but I can live with that. We also cache each comment individually in the serializer. Remember that nothing we’ve done so far is specific to any user.
Decorate Phase
Phew, how are you doing? Still with us? We’ve covered a lot of ground so far, all just so that we can fetch some comments efficiently. To recap, we now have a fully serialized comment tree, but it’s missing some important tweaks.
If you remember from earlier, we have some requirements regarding the handling of pending and deleted comments. Deleted comments are not returned unless they have replies, in which case they are returned but their contents are removed. The same goes for pending comments, unless they’re your own, in which case they’re always returned and with content. We do some quick filtering on the comment tree to handle this logic. The decorate method is recursive so this happens for every level in the tree.
Once we prune away all the comments we don’t need, we can add the last bits of required data to each comment. This is the fun part where we tackle multiple N+1 query problems that we had previously. Specifically, checking whether the current user has liked a comment and whether the current user has blocked the poster of a comment. Since we are processing the whole comment tree at once, we can pre-fetch all of this data. And to make things even better, we only need to fetch IDs from the database instead of building full ActiveRecord objects!
Now checking whether a user is blocked or or has liked a comment is a simple array lookup, and it only took 3 database queries to grab all the data we need. We can finish up the decorate phase by putting this data to good use.
The Results
And that’s it! We now have a full serialized comment tree that meets all of our requirements, and only took 5 database queries instead of 1000. To see the effect this had on the comments API endpoint, take a look at a little before and after.