How to Choose the Right Database for Your App

Making those database decisions a little easier and a little wiser

devDeejay
devDeejay
Dec 30, 2020 · 9 min read
Analytics being shown on a screen.
Analytics being shown on a screen.

Working on new projects is always super exciting — we have the freedom to design and architect the things any way we want. But this planning, when not done properly, can cause us a lot of pain in the future.

Choosing your app database is one of the critical decisions you’ll have to make, and with this article, I intend to introduce you to various database options — along with listing some pros and cons to help you make wiser database decisions.

KeyValue

Our database is structured just like a JSON object — every key is unique, and each key points to some value.

The key user:73:name points to the user Dhananjay Trivedi, for example.
The key user:73:name points to the user Dhananjay Trivedi, for example.

It keeps the data in the memory, which is very fast but comes with a capacity limit, so you can’t store a huge amount of data. And since there’s no disk involved, everything is blazing fast.

There are no queries or joins involved, so there isn’t much data modelling to worry about. Since there’s no schema, the developers always have the flexibility to change data as they may like.

Pro/con chart. Pros: Blazing fast, simple, flexible. Cons: Not suitable for large complex data.
Pro/con chart. Pros: Blazing fast, simple, flexible. Cons: Not suitable for large complex data.

When to use this technique

  • This technique is mostly used as a caching mechanism for when some part of the data is being very frequently fetched and observed
  • Hence, the key–value technique is widely used along with other databases as a caching mechanism
Redis and Memcached are examples of databases that use this technique.
Redis and Memcached are examples of databases that use this technique.

Wide Column

This is like key–value but on steroids. The value is modified to store a set of columns instead of just plain data.

The value is modified to store a set of columns instead of just plain data.
The value is modified to store a set of columns instead of just plain data.

With the introduction of columns, you can now group your related data, but still, there’s no standard schema in place. Hence, each of the keys can point to differently grouped data.

Since there’s no schema, it can handle unstructured data and comes with a query language called CQL, which is similar to SQL but way less powerful.

Data comes in a continuous stream, like from an IoT device, stock market, financial transactions, or your Netflix watch history.

Pro/con list. Pros: No schema (flexible), can handle large data, horizontal scaling. Cons: Not suitable for complex queries.
Pro/con list. Pros: No schema (flexible), can handle large data, horizontal scaling. Cons: Not suitable for complex queries.

When to use this technique

  • Frequent writes
  • Infrequent updates or reads

This is still not general-purpose. Hence, it can be used for storing historical data from all of our different apps.

Document-Style Database

This is one of the most popular database techniques we use. This obviously consists of documents, and each document is a set of keyvalue pairs. They’re unstructured and don’t require a schema.

This obviously consists of documents, and each document is a set of key-value pairs.
This obviously consists of documents, and each document is a set of key-value pairs.

Documents are grouped together into collections, and these collections can be structured into a logical hierarchy.

This logical collection of data lets you group your related data in a much more logical manner, which seems similar to a relational database.

Pros: No schema (flexible), can run simple queries, faster reads. Cons: Can’t run complex queries, data denormalization.
Pros: No schema (flexible), can run simple queries, faster reads. Cons: Can’t run complex queries, data denormalization.

Since our database can’t run join queries, what do we do to get all that related data at once?

We store it all together! We encourage the denormalization of your database and data duplication/inconsistency is a trade-off we’re ready to make.

Reads are really fast, but writing and updating the data while making sure its consistent can be a bit of a challenge.

The document database is a really great fit for general-purpose apps and might be a good fit for most of the apps, games, and IoT.

If you really aren’t sure about your database schema, then a document database is the best way to start.

Popular document-type databases

Logos of MongoDB, Amazon DynamoDB, Cloud Firestore, and Apache CouchDB Relax.
Logos of MongoDB, Amazon DynamoDB, Cloud Firestore, and Apache CouchDB Relax.

The document-style database falls short when you have a lot of data, which might be directly or indirectly related to each other.

For those scenarios, you’ll have to run multiple complex queries and then merge all the received data in the front-end application, or you can use a relational database where these complex queries are managed by the database.

Relational Database

We all have heard about these databases, and the most popular ones are MySQL, Postgres, and SQL Server. They’ve been here for a while and still are a popular choice for many applications.

We use Structured Query Language (SQL).

Significance of “relational”

Imagine a car factory where there are different hubs that manufacture the car parts.

Let’s say the doors are manufactured at one place, while the wheels, the body, and the interiors are all made at their own different locations.

Imaginary car-factory blueprint
Imaginary car-factory blueprint
Imaginary car-factory blueprint

Each manufactured part has a unique ID assigned to it.

So once a car has to be assembled, you can fetch all of the parts from all of these different locations and assemble your car.

An image of a car with lines showing it has taken a part from body manufacturing, a part from wheel manufacturing, etc.
An image of a car with lines showing it has taken a part from body manufacturing, a part from wheel manufacturing, etc.

For building such a factory, we’d create the blueprint for such a factory, which makes sure that the overall process of manufacturing the car is very efficient and optimal. This blueprint is called a schema when it’s used in a database.

Hence, we need to plan out the schema for our database to make sure our database is also very efficient for our application’s data needs.

The downside

  • Just like how, over time, changing the layout of a car factory to coincide with changing requirements will cost a car company a ton of time and money, it’s a similar situation when a large-scale application has to do so. Make sure to use a relational database when your requirements are clear.
  • Also, once you build a plant with the capacity of manufacturing 30 cars a month, you can’t easily scale your plant to make 90 cars a month. Similarly, our relational databases can be harder to scale, but there are some exceptions like Cockroach DB and PostgreSQL that are designed to work at scale.

The good side

  • The SQL databases are ACID-compliant, which means our data validity and integrity are never compromised, even though the read-write operation might fail in between — which makes it a great fit for banking/financial-related data
  • Once you have a schema in place, you can be assured the data stored will always be stored in a fixed structure following a set of validations, which you’d have defined in the schema

What’s the best fit for you?

  • If your requirements are clear and you’re sure that you won’t be needing any drastic changes to the requirements, go for this one
  • If you’re not so sure about the requirements and are in sort of an experimental phase, it’s better to go with a NoSQL database

But what if we don’t need to create a schema and can directly store the relationship as data?

Graph Database

Here our data is stored in the nodes, and the relationship is defined as edges. It’s pretty sweet! Let’s see how.

If you have to find out all of the students studying computer science in a SQL database, you’ll need a lookup/middleman table that stores records of all the students studying computer science separately.

If you have to find out all of the students studying computer science in a SQL database, you’ll need a lookup/middleman table
If you have to find out all of the students studying computer science in a SQL database, you’ll need a lookup/middleman table

In graphs, it’d just be way more straightforward, as we won’t have to store the relationship part of our data separately, and it comes instinctually with this new style.

Relationships are easier to record and maintain in graphs
Relationships are easier to record and maintain in graphs
Relationships are easier to record and maintain in graphs

With this new way of directly showing the relationship between two nodes, our complex join queries here become much more simple, greatly improving the performance of the database when compared to SQL.

Hence, use such a database when you’re dependent on a lot of join operations and the performance is degrading because of that.

Searchable Database

If you’re building an application like Google, where on a small-string query search you have to return all of the matching records fast — you’re talking about a full-text search engine.

These databases are based on the Apache Lucene project started in 1999.

Algolia and MeiliSearch are full-text search engines.

They look similar to a document-type database. We have an index, and we add data objects to it. The search-database engine will analyze all of the text in the document and create something called inverted indexes.

When you query something, the database only goes and checks in the inverted indexes, which makes the overall process look blazing fast, even for a large database.

I saved the most exciting one for last.

Multi-Model Database

There are multiple options out there, but the most popular one seems to be Fauna.

As application devs, we usually just care about the JSON, which we can consume in the front end of our apps.

With Fauna, we don’t have to worry about data modelling, the schema, scaling, replication, or normalization and simply just get our JSON data. We define how we want to access our data using GraphQL.

Let’s take an example of a scenario of an Instagram-like app. We’ll define our rules using JSON for a User, Post, and Query.

We’ll define our rules using JSON for a User, Post, and Query.
We’ll define our rules using JSON for a User, Post, and Query.

We just uploaded our GraphQL schema — it automatically creates a collection to store data and indexes to query the data.

Behind the scenes, it’s figuring out how to take advantage of different paradigms like relational, graphs, and document based on the GraphQL schema you provided.

We just add our data simply in the same way we did in a document database, and we’re not stuck with the limitations of data modelling.

The best part — this is ACID-compliant and is extremely fast.

You don’t need to worry about the infrastructure. Just define how you need the data, and the cloud will take care of the rest for you.

Downside

Obviously the pricing is a downside. Great things don’t come free, but they do have generous plans/open-sourced options for devs who want to learn — and small startups as well.

Pro/con list. Pros: No schema, ACID-compliant, extremely fast, cloud does the heavy lifting. Cons: Can get expensive.
Pro/con list. Pros: No schema, ACID-compliant, extremely fast, cloud does the heavy lifting. Cons: Can get expensive.

Here are some great features worth noting listed by Fauna:

Table of Fauna features like flexible data modeling, global ACID transactions, etc.
Table of Fauna features like flexible data modeling, global ACID transactions, etc.

We’re not done! There’s so much more to learn different databases, but I hope this was a good introduction to the various options we have as developers to use in our apps.

Thanks for reading. Have a good one!

Better Programming

Advice for programmers.

Sign up for The Best of Better Programming

By Better Programming

A weekly newsletter sent every Friday with the best articles we published that week. Code tutorials, advice, career opportunities, and more! Take a look

By signing up, you will create a Medium account if you don’t already have one. Review our Privacy Policy for more information about our privacy practices.

Check your inbox
Medium sent you an email at to complete your subscription.

Thanks to Zack Shapiro

devDeejay

Written by

devDeejay

Developer who loves to build beautiful apps and write great Medium stories to share what I have learned.

Better Programming

Advice for programmers.

devDeejay

Written by

devDeejay

Developer who loves to build beautiful apps and write great Medium stories to share what I have learned.

Better Programming

Advice for programmers.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store