MongoDB! And bookmarks in it!

Akarsh Satija · 3 min read · Aug 17, 2016

Hey guys,

First, an unrelated little tidbit of advice.

More and more people are moving towards Mongo these days.

Do I like this shift? Not really.

I’ll always be a relational database kinda guy. The reason: it really makes you think about your database schema changes before you make them. It makes you think about how adding one column, or one field, to your table will impact performance, positively or negatively. Which brings me to an ODM approach for Mongo as well, such as Mongoose, which I highly recommend.

Maybe you’re already on that approach, hence enforcing table structure, in which case I probably just sound like an old nagging mother. If you aren’t, PLEASE move to Mongoose, or Mongo will come bite you in the a** in the future. Remember, relational databases are as large and popular as they are for a reason. They’re tried and tested. So…

HAVE PROPER TABLE STRUCTURE!
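A minimal sketch of what enforcing structure with Mongoose looks like; the model and field names here are just for illustration, not anything from the original setup:

```typescript
import mongoose, { Schema } from "mongoose";

// Hypothetical schema: every field, its type, and whether it is required
// is declared up front, much like a relational table definition.
const bookmarkSchema = new Schema({
  title: { type: String, required: true },
  url: { type: String, required: true },
  photo: String,
  eventDate: Date, // optional: editorial content has no date
});

export const Bookmark = mongoose.model("Bookmark", bookmarkSchema);
```

With a schema like this, Mongoose rejects documents that don’t match the declared shape, which is exactly the “proper table structure” discipline above.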

Now that we have that out of the way, let’s talk bookmarks/other relations.

We’re conflating the issue of bookmarks a bit with you adding “other relations” in there.

So, funny story: we’ve had the bookmarks problem since the beginning of time, so let me walk you through our iterations there.

I was a user with a few hundred bookmarks, so clicking on the bookmark tab would take about 1–3 seconds to load for me. But A was a super user with 20,000+ bookmarks, and his tab would take anywhere between 10–30 seconds to load, which was terrible UX.

How did we set it up?

User table (unique userID) <-> many-to-many relationship via the user_2_bookmarks table (foreign key of userID).

Since, in a logged-in flow, you always have the user ID, clicking on the bookmark tab would bring up all rows from the user_2_bookmarks table.

What did we store in the user_2_bookmarks table?

A relation to the actual bookmark (so, another query here).

This approach worked fine for, say, a thousand bookmarks, but broke down beyond that.
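Roughly, that setup looks like the sketch below; the collection and field names are illustrative, not the exact production code:

```typescript
import mongoose, { Schema } from "mongoose";

// Illustrative join "table": one document per (user, bookmark) pair.
const userToBookmarkSchema = new Schema({
  userId: { type: Schema.Types.ObjectId, ref: "User", required: true, index: true },
  bookmarkId: { type: Schema.Types.ObjectId, ref: "Bookmark", required: true },
});

export const UserToBookmark = mongoose.model("User2Bookmark", userToBookmarkSchema);

// Clicking the bookmark tab: fetch every row for the user; populate()
// is the second query that pulls in each referenced bookmark.
export async function loadBookmarks(userId: string) {
  return UserToBookmark.find({ userId }).populate("bookmarkId").exec();
}
```

Two round trips per page load is fine at a few hundred rows, but at 20,000+ bookmarks the populate step is what hurts.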

How did we solve it?

In comes the caching layer, and a pretty big one at that. AWS gives you this beautiful service called ElastiCache: a caching layer that they scale up and down as needed, running an engine of your choosing (Redis or Memcached). Into it we threw a bookmark_to_user collection mapping userID <-> snapshot of bookmark, where the snapshot was all the data needed to display the actual bookmark, i.e. the title, the URL to the event, the photo, and the date (if any; editorial content has no date).

This made loading of bookmarks lightning fast.
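A hedged sketch of that read path, assuming the Redis engine behind ElastiCache and an ioredis client; the key format, endpoint URL, and helper names are made up for illustration:

```typescript
import Redis from "ioredis";

// ElastiCache just hands you a Redis (or Memcached) endpoint; the URL
// below is a placeholder, not a real host.
const cache = new Redis("redis://my-elasticache-endpoint:6379");

// The snapshot holds only what the bookmark list needs to render.
interface BookmarkSnapshot {
  title: string;
  url: string;
  photo?: string;
  eventDate?: string; // absent for editorial content
}

// Cache-aside read: try the cache first, fall back to Mongo on a miss
// and repopulate the cache for next time.
export async function getBookmarkSnapshots(userId: string): Promise<BookmarkSnapshot[]> {
  const cached = await cache.get(`bookmarks:${userId}`);
  if (cached) return JSON.parse(cached);

  const snapshots = await loadSnapshotsFromMongo(userId); // the two-query path shown earlier
  await cache.set(`bookmarks:${userId}`, JSON.stringify(snapshots));
  return snapshots;
}

// Placeholder for the Mongo query path; plug in your own models here.
async function loadSnapshotsFromMongo(userId: string): Promise<BookmarkSnapshot[]> {
  throw new Error("wire this up to your Mongo models");
}
```

The point is that the hot path becomes a single key lookup instead of a join-plus-populate over thousands of rows.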

So are you saying I should throw everything into a caching layer?

FU*K NO. That’s a common misconception, and pretty soon you’ll end up paying out of your ass for the caching layer if you take that approach.

Instead, figure out the most read-heavy parts of your application. This is something that will happen over time. Once you’ve figured that out, and figured out how to cache those objects, you’ll have your answer. Remember, caching immediately adds a ton of complexity to your app, because you start dealing with issues such as cache invalidation, AWS dropping the ball, stale caches, etc.

Every time a user updates a bookmark, or adds a new one, you need to update the cache. So please use a cache wisely. And also remember: memory is more expensive than storage, so obviously caching will be much more expensive.
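A minimal sketch of that invalidation, under the same assumptions as above (Redis engine, ioredis client, illustrative key format):

```typescript
import Redis from "ioredis";

const cache = new Redis("redis://my-elasticache-endpoint:6379"); // placeholder endpoint

// Whenever a user adds or updates a bookmark, drop that user's cached
// snapshot list; the next read rebuilds it from Mongo. Deleting the key
// is the simplest invalidation strategy, at the cost of one slow read.
export async function onBookmarkChanged(userId: string): Promise<void> {
  await cache.del(`bookmarks:${userId}`);
}
```

You could also rewrite the cached list in place on every change, but delete-and-rebuild is harder to get wrong.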

Also, if someone offers to move you to RDS, slap them.

RDS is a beautiful piece of technology that you don’t need until you start feeling the kind of pain that makes you want to hire a DBA. The day you get there, move to RDS and pay out of your ass; until then, manage your own DB.

Let me know if the above makes sense, or if you need any other help.

Happy scaling…

#Peace
