SQL vs. NoSQL: A Beginners Guide

Jonathan
7 min readFeb 9, 2018

--

So you’ve been coding for a while, and you just realized that your application can greatly benefit from a database to store persistent information — data that will remain available, even if your application goes offline. Lucky for us, there are plenty of options out there that let us do this sort of thing… but where to start?

There are two very popular database types available today, with some very promising newcomers that may be changing game in the coming years. If you’ve done any research on databases before landing here, you’ve probably already come across SQL and NoSQL (surprise, surprise). Let’s walk through the characteristics of each.

SQL

SQL databases have been around for a long… long time, and have proven their worth. SQL, standing for Structured Query Language, is actually just the language that these databases are written in. You’ll be using SQL to read and write data from these databases, but a better way to think about them conceptually, is relational databases. These databases are compromised of tables — which you can think of as something similar to an Excel spreadsheet, where each column is a specific field of data in your database, and each row is an entry in that table. We’re off to a great start because spreadsheets are a very nice way to keep things structured and organized, but we’re not done yet. Let’s take a look at a quick example to both point out a flaw and at the same time, see how it’s handled.

Let’s say we have a list of users. Each user has a first name, a last name, and an age. We’ll also give them a unique id (called a primary key)to ensure that there’s only one of them in our system. We’ll end up with a table that looks something like this:

Now, let’s say that each one of our users also has children that we need to keep track of as well. This immediately presents a problem since we’re not exactly sure how many children each of our users has. Additionally, we need to keep track of their names and ages as well. We could add more columns to our existing table, but without knowing exactly how many children each user has, we could end up making too many columns, or even worse — not enough. To address this, we’ll simply make another table to hold the children’s data, and make a reference back to the users table to show which user they belong to:

Bam! We’ve now got our Children table, that holds all the information about each child that we need. But notice that there’s an extra column on this table called ‘parent_id’. This is what’s known as a foreign key, and is a reference to the unique id (primary key) that we assigned each parent in the Users table. Using this technique, we have established a relation between these two tables and can now add as many users/children as we need to without worrying about things getting out of hand.

This of course is a very high level explanation (there exists full time jobs just to create and manage these systems — yikes!), but should drive the point home that this is a very scalable solution for holding all of your data in a very structured and organized way — something that enterprise level corporations may be interested in doing…

NoSQL

NoSQL databases are, as you have likely guessed, are the opposite of relational databases. There are a few different types of NoSQL databases out there, but we’ll be focusing on Document based databases (ex. MongoDB) in this article as they are one of the most popular options.

Let’s hop right in: instead of using tables like in a relational database, you’ll use documents. Documents? Glad you asked. Documents are essentially a representation of an object in programming. Let’s take our example above and see what those tables would look like as documents to drive the point home.

Users = [  {    "_key": 0,    "first_name": "John",    "last_name": "Doe",    "age": 24  },  {    "_key": 1,    "fist_name": "Jane",    "last_name": "Smith",    "age": 28  }]

If you’ve worked with JSON before, this should look familiar. Here we’ve still got the same information as in the first example, but now instead of conceptualizing the data as a spreadsheet, we’re conceptualizing it as an object (where each object in the array above is a document) that we might see when working with our code. This makes it immediately more intuitive to work with since ‘everything is an object’. It’s also a lot more flexible, since we’re not bound by columns and rows. For example, if we wanted to add each users children to the database like we did for the first example, we can simply add a new property to each document like so:

Users = [  {    "_key": 0,    "first_name": "John",    "last_name": "Doe",    "age": 24,    "children": [      {        "first_name": "Hannah",        "last_name": "Doe",        "age": 12,      },      {        "first_name": "Jimmy",        "last_name": "Doe",        "age": 7,      }    ]  },  {    "_key": 1,    "fist_name": "Jane",    "last_name": "Smith",    "age": 28,    "children": [      {        "first_name": "Frank",        "last_name": "Smith",        "age": 9,      },      {        "first_name": "Jenny",        "last_name": "Smith",        "age": 14,      },      {         "first_name": "Brandon",         "last_name": "Smith",         "age": 13,      }    ]  }]

Awesome! Now we’ve got all the same data, but organized in a different way of thinking. Also notice that now each child no longer has their own id — this is because the children object is now nested inside their respective parent, so there’s no longer a need for it.

Believe it or not, that is pretty much the gist of it! Conceptually, anyway…

That being said, let’s hop into the question you came here for!

Which Database to Use

It depends! I know — bogus answer, right? What I’m getting at is that each database type has its own merits and pitfalls. We’ll discuss a few of them briefly, but before making a decision on what database to go with, you should already know what you’re building and what you want your data to look like, and then pick the solution that works best for your needs — just as you would with any other technology for your application.

Now, this is by no means an exhaustive list of pros/cons — but are usually some of the main points that are hit when considering which type of database to use. That being said, I would encourage you to dive deeper into each database and even build some small applications using each one to see how they work!

Arguments for SQL (Relational Databases)

Structured Data

Relational databases are really nice when you have a lot of data that’s needs to be structured in a very specific way, with different sets of data that relate to each other in some way.

ACID compliance

Most SQL databases are ACID compliant. Hit the link for the full details, but the short and sweet version is that this is a set of rules that guarantees the integrity of your data — even if the power going to your servers is inadvertently cut off!

Joins

This is in regards to data retrieval. When developing a complex application, often times you’ll need to retrieve multiple sets of data held in different tables. This is easily accomplished by ‘joining’ two or more tables together, and grabbing all of the information you need at once. Currently, there’s no way to do this in a Document based NoSQL database — meaning that if you need to retrieve multiple sets of data, you’ll have to make multiple queries to your database!

Arguments for NoSQL

Flexible

If your data is not necessarily structured in a certain way, or if you have multiple sets of data that need to be structured in a way that a table could not easily support (ie: nested objects), NoSQL databases are a great option. They allow you the flexibility to structure your data in a way that makes the most sense for your application without needing to abide by the row/column format.

Fast

At the cost of ACID compliance, you gain significant speed! Not to say that SQL databases are slow, but by comparison NoSQL databases are much faster when you’re querying them. However, be careful when considering this point — as mentioned earlier, there’s no way to ‘join’ different sets of data, so if you find that your application needs this functionality, it may cost you the speed that you gain if you have to make lots of queries to your database to get the data you need.

Ease of Use

As you saw above, everything is represented as an object — making them very easy to reason about since we already know how objects work and are structured in programming. As a result, there’s very little learning curve in using these kinds of databases which is nice if you need to get up and running quickly.

Conclusion

To reiterate — each type of database (even those not mentioned here) has there pros and cons and will be better suited to different applications. Being familiar with how each type works will help you make decisions on which one is right for the task at hand, so get out there and build!

--

--