MongoDB Joins Across Collections (with Ruby)

‘Join’ across Mongo collections using simple Ruby.

MongoDB is sweet, flexible and (when used correctly) enables insane productivity. Sadly, its standard find queries do not support DB-side joins. If you want to model your data in a traditional ‘relational’ way (which I posit is the best way, even with MongoDB), it can be irksome and error-prone to query joins on the app-side.

Some people consider this a downside of Mongo; I hope to show we can abstract that away with minimal DRY logic.

Normalized Models are Good; Joining is Good.

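Suppose we keep users, posts and messages in separate collections, and want to fetch a user together with his posts and messages. In SQL, this would be a ‘join’ query. Using ActiveRecord or whatnot, this might be expressed through the ORM, resulting in a single DB-side call. With a standard MongoDB find this is not possible (the aggregation pipeline's $lookup stage, added in MongoDB 3.2, is the exception), so we must make the ‘join’ on the app-side.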

Since ‘joining’ across collections is a common pattern, we can and should DRY it with a common method. We will do this in Ruby-land, and discuss the implications later.

Real-World Example

We simply find the user by his id, derive the foreign key from the collection name (“users” without the last char, plus “_id”), and then find the relevant posts by user_id, and likewise for messages. We then return the user, posts and messages.
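
The steps described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the method name join_by_id and the FakeCollection class are hypothetical, and the in-memory stand-in exists only so the sketch runs without a server. With the official Ruby driver, a Mongo::Client should be usable as db, since client[:users].find(filter) returns an enumerable view.

```ruby
# Hypothetical helper: app-side 'join' across collections by foreign key.
# `db` is anything where db[name].find(filter) returns an Enumerable of
# documents, e.g. a Mongo::Client from the official Ruby driver.
def join_by_id(db, id, parent, *children)
  foreign_key = "#{parent.to_s.chop}_id"        # "users" -> "user_id"
  record = db[parent].find('_id' => id).first   # one call for the parent
  # One additional call per child collection.
  related = children.map { |name| db[name].find(foreign_key => id).to_a }
  [record, *related]
end

# In-memory stand-in for a collection (an assumption for this sketch only).
class FakeCollection
  def initialize(docs)
    @docs = docs
  end

  # Returns the documents whose fields equal every key/value in the filter.
  def find(filter = {})
    @docs.select { |doc| filter.all? { |key, value| doc[key] == value } }
  end
end

db = {
  users:    FakeCollection.new([{ '_id' => 1, 'name' => 'Ann' }]),
  posts:    FakeCollection.new([{ 'user_id' => 1, 'title' => 'Hello' },
                                { 'user_id' => 2, 'title' => 'Other' }]),
  messages: FakeCollection.new([{ 'user_id' => 1, 'body' => 'Hi there' }])
}

user, posts, messages = join_by_id(db, 1, :users, :posts, :messages)
puts user['name']      # prints "Ann"
puts posts.length      # prints 1
puts messages.length   # prints 1
```

Passing the child collections as a splat keeps the call site short: joining one more collection is just one more argument.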

This results in one DB call per collection (three in this example), since the ‘join’ is made app-side. This is a performance hit. However, I posit that in many apps and most access patterns, the cost is negligible. Mongo is fast, humans are slow: optimize accordingly.

This generic access pattern can be used for ‘joining’ across any Mongo collections. Obviously here we assume the naming of collections and keys follow the users->user_id convention, and one might be advised to allow for some further configuration such as limits, custom foreign keys, criteria/projections, wiring to your ODM of choice, etc. You should also index the keys on which you query (in this case, user_id in posts and messages).
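
As one possible direction for such configuration, here is a hedged sketch that accepts a custom foreign key per child collection and falls back to the users->user_id convention otherwise. The names join_with_keys and FakeCollection, and the author_id key, are hypothetical and chosen only for illustration.

```ruby
# Hypothetical variant: each child collection may name its own foreign key.
# `children` maps collection name => foreign key (nil means "use the convention").
def join_with_keys(db, id, parent, children)
  default_key = "#{parent.to_s.chop}_id"   # "users" -> "user_id"
  record = db[parent].find('_id' => id).first
  related = children.map do |name, key|
    foreign_key = key || default_key
    db[name].find(foreign_key => id).to_a
  end
  [record, *related]
end

# In-memory stand-in for a collection (an assumption for this sketch only).
class FakeCollection
  def initialize(docs)
    @docs = docs
  end

  def find(filter = {})
    @docs.select { |doc| filter.all? { |key, value| doc[key] == value } }
  end
end

db = {
  users:    FakeCollection.new([{ '_id' => 1, 'name' => 'Ann' }]),
  posts:    FakeCollection.new([{ 'author_id' => 1, 'title' => 'Hello' }]),
  messages: FakeCollection.new([{ 'user_id' => 1, 'body' => 'Hi there' }])
}

# posts uses a custom key; messages follows the convention.
user, posts, messages =
  join_with_keys(db, 1, :users, posts: 'author_id', messages: nil)
puts posts.first['title']     # prints "Hello"
puts messages.first['body']   # prints "Hi there"

# Against a real Mongo::Client, the queried keys could be indexed with e.g.
#   db[:posts].indexes.create_one(author_id: 1)
```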

This is just minimal vanilla Ruby over the native MongoDB driver: it can be easily extended or modified to fit your exact flavors. However, for the general access pattern of joining across a foreign key, this pattern is very useful and DRY.

Notes

  • The above is of course an alternative to holding denormalized models, which I find in practice to be error-prone; cache invalidation is famously difficult.

Summary

Full-Stack, Ruby, and JavaScript Consultant @ US, UK, IL. If you enjoy my work or are looking for a consultant, reach out at http://sellarafaeli.com
