Books I Read In May

a) MongoDB Applied Design Patterns by Rick Copeland
b) Beginning ASP.NET 4.5 Databases by Sandeep Chanda and Damien Foggon

Books I Read in May (Part 1)
A conversation between Will and his therapist in the movie Good Will Hunting:

“Will: You surround yourself with all the wrong fucking books.
Prof: What are right fucking books?
Will: Whatever blows your hair back …”

If you are new to MongoDB, it is a leading non-relational, document-oriented data store that promises to reinvent online databases. Relational concepts such as schema normalization, referential integrity and structure are somewhat sacrificed in favour of scalability, speed and a more evolutionary data schema. I say “somewhat sacrificed” because a skilled MongoDB developer can find ways to mimic the sacrificed qualities.

My first exposure to MongoDB was in the summer of 2013. I was hired to help develop a data analytics tool for a start-up here in Nigeria. By my observation, the project was discontinued mostly due to a knowledge gap between how we intended to use MongoDB and how it should have been used.

Since then, I have been interested in the NoSQL movement, and in MongoDB in particular, mostly because I just didn’t get how it ought to be used. The confusion only grew when I discovered that a lot of start-ups were using the technology, and that testimonies were being given about how great MongoDB is and how not so great it is.

Book “a)” showed me that using MongoDB is not only different from using a relational database, but also that deciding to use Mongo on a project, and then using it well, is an art. The design patterns part of the book starts by asking the question “To Embed or Reference”, which will definitely whet your appetite.

The book explores the following use cases in great detail: operational intelligence (real-time analytics), e-commerce, content management systems, online advertising networks, social networking and online gaming.

Now, let’s explore some “Aha” moments in the Social Networking use case.
It’s very easy to view a social graph through the lens of a relational database. You know, you are friends with your friends (pun intended), your friends are related to their friends, and so on. In our social network, let’s define some things. What data should be in a user profile? We’ll store gender, age, interests, relationship status and so on. What types of updates are allowed? Status updates, photos, links and check-ins.

So, you are probably picturing a table with a foreign key relationship with another table. Let’s call the first table “Profile” and the second one “Followers”.

“Profile” contains basic user data, while “Followers” contains the users who follow a particular profile. Now, let’s complicate things a little bit by having a table called “Circles” that is related to the “Followers” table. The “Circles” table allows a user to view feeds from followers belonging to a particular circle.

The problem with the relational database approach is that as your network grows, it takes more and more time to load an individual user profile, mostly because the query to do this will contain SQL joins. The problem with joins is that they typically require random seeks on disk. Ouch!
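
To make that cost a bit more concrete, here is a rough sketch in plain JavaScript of what the relational layout asks for on every profile load. The arrays stand in for real tables, and the table and column names (profiles, followers, circles, profileId, followerId) are my own assumptions for illustration, not something taken from the book.

```javascript
// Hypothetical normalized "tables"; names and columns are assumptions for illustration.
const profiles = [
  { id: 1, name: 'Rick' },
  { id: 2, name: 'Jared' },
  { id: 3, name: 'Bernie' },
];
const followers = [
  { profileId: 1, followerId: 2 },
  { profileId: 1, followerId: 3 },
];
const circles = [
  { profileId: 1, followerId: 2, circle: 'python' },
  { profileId: 1, followerId: 2, circle: 'authors' },
  { profileId: 1, followerId: 3, circle: 'python' },
];

// Emulates joining Profile, Followers and Circles for one profile.
// Every follower costs extra lookups in the profiles and circles tables.
function loadProfileRelational(profileId) {
  const profile = profiles.find((p) => p.id === profileId);
  if (!profile) return null;
  const followerRows = followers.filter((f) => f.profileId === profileId);
  return {
    name: profile.name,
    followers: followerRows.map((f) => ({
      name: profiles.find((p) => p.id === f.followerId).name,
      circles: circles
        .filter((c) => c.profileId === profileId && c.followerId === f.followerId)
        .map((c) => c.circle),
    })),
  };
}

const rick = loadProfileRelational(1);
```

In a real database those nested lookups would be index or table reads rather than array scans, which is where the seeks come from as the follower count grows.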

The Mongo way of doing this would be to create an independent collection called social.user, with documents that look like this:

    {
      _id: 'T4Y…AC', // base64-encoded ObjectId
      name: 'Rick',
      profile: { … age, location, interests, etc. … },
      followers: {
        "T4Y…AD": { name: 'Jared', circles: [ 'python', 'authors' ] },
        "T4Y…AF": { name: 'Bernie', circles: [ 'python' ] },
        "T4Y…AI": { name: 'Meghan', circles: [ 'python', 'speakers' ] },
      },
      circles: {
        "10gen": {
          "T4Y…AD": { name: 'Jared' },
          "T4Y…AE": { name: 'Max' },
          "T4Y…AF": { name: 'Bernie' },
          "T4Y…AH": { name: 'Paul' },
          … },
        … },
      blocked: [ 'gh1…0d' ]
    }

Because everything lives in one document and MongoDB has no joins to perform, it’s incredibly easy to load a user’s profile data in a single query. Also notice that instead of having separate tables (as in the relational database approach), we embed the “circles” and “followers” data in a single document, which makes it easier to load user profile data at once. Then again, as profile data increases, one can shard the collection so as to scale beyond a single replica set. The schema is optimized for read performance.
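
As a rough sketch of that read path, the snippet below uses a plain JavaScript object shaped like the social.user document above in place of a real query result. In the mongo shell I would expect the fetch itself to be a single findOne on the collection by _id. The helper name followersInCircle is my own invention, not the book’s, and the elided IDs are kept exactly as they appear in the example.

```javascript
// Stand-in for the single document a lookup by _id would return.
// Keys like 'T4Y…AD' are the elided IDs from the example, kept as-is.
const user = {
  _id: 'T4Y…AC',
  name: 'Rick',
  followers: {
    'T4Y…AD': { name: 'Jared', circles: ['python', 'authors'] },
    'T4Y…AF': { name: 'Bernie', circles: ['python'] },
    'T4Y…AI': { name: 'Meghan', circles: ['python', 'speakers'] },
  },
  blocked: ['gh1…0d'],
};

// Hypothetical helper: names of the followers who belong to a given circle.
// Everything needed is already inside the one document, so no joins.
function followersInCircle(doc, circle) {
  return Object.values(doc.followers)
    .filter((f) => f.circles.includes(circle))
    .map((f) => f.name);
}

const pythonFollowers = followersInCircle(user, 'python');
const authorFollowers = followersInCircle(user, 'authors');
```

The trade-off is on the write side: with follower data embedded like this, adding or removing a follower means updating the one user document rather than inserting a row somewhere else.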

This example is rather contrived; I encourage you to get the book, it will definitely blow your hair back.

Science and technology may have brought us tools like MongoDB, but using them well is centred less on logic than on art. I wish I had this book when I was building the data analytics tool.

PS: I eventually developed a working prototype of the data analytics tool sometime this year using Elasticsearch and Kibana. That was before I got the Mongo book.
