Recap Of All Existing Database Types You Must Know About
No, SQL and NoSQL are just not enough
A long time ago, I wrote an article talking about the differences between SQL and NoSQL databases. It was quite successful at the time. And I thought I showed, at that moment, a comprehensive list of database types.
Well, I was so wrong.
First of all, have you ever heard about the term polyglot persistence?
“Polyglot persistence is about using different types of storing technologies, to fulfil persistence needs inside an application”.
Let me give you an example. Suppose you’re writing a social media app like Facebook or Instagram. You use a SQL database to store the relationships between people: who they follow and which hashtags they follow.
Of course, in this app, you want to store data about user sessions. Let’s do that using a key-value store like Redis, which provides low latency and great performance.
What if you want to analyze Big Data to get user analytics? A wide-column DB is a good fit for that.
Now your user base has grown, and you want to provide a better service. You know what your users need: a recommendation system. Let’s use a graph database to fulfil this need.
Lastly, let’s help them find their friends. Let’s use a document-oriented search engine like Elasticsearch for that.
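To make the idea concrete, here is a minimal sketch of the social media example, assuming plain in-memory Python containers stand in for each database. The class and method names (`PolyglotBackend`, `follow`, `start_session`, `search_profiles`) are made up for illustration; a real app would call the client library of each database instead.

```python
# In-memory stand-ins for the separate stores in the social media example.
# Every name here is hypothetical; real code would use one client per database.

class PolyglotBackend:
    def __init__(self):
        self.follows = set()   # relational stand-in: (follower, followed) rows
        self.sessions = {}     # key-value stand-in: session token -> user
        self.profiles = []     # document stand-in: searchable user documents

    def follow(self, follower, followed):
        self.follows.add((follower, followed))

    def start_session(self, token, user):
        self.sessions[token] = user

    def add_profile(self, profile):
        self.profiles.append(profile)

    def search_profiles(self, text):
        """Case-insensitive substring search on the profile name."""
        text = text.lower()
        return [p for p in self.profiles if text in p["name"].lower()]


backend = PolyglotBackend()
backend.follow("alice", "bob")
backend.start_session("token-123", "alice")
backend.add_profile({"name": "Bob Smith", "city": "Rome"})

matches = backend.search_profiles("bob")
print(matches)
```

Each attribute is a separate store with its own access pattern, which is exactly what makes a polyglot system powerful and, at the same time, more work to operate.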
See, there are so many different DB technologies out there. So many ways to do stuff. And, like in the example just given, so many ways to combine them. Such a polyglot system makes for some tough management work: you have to maintain, test, and monitor all the living ecosystems in your app universe.
Is there an easier way to do this? The answer is yes, and I’ll reveal it at the end of the article. First, I want to help you lighten the knowledge load we just created.
Let’s find out more about different DB technologies.
Document-Oriented Database
A document-oriented database is one of the main types of NoSQL database. It stores data in independent documents, which are usually semi-structured and in a JSON-like format.
These kinds of databases are great for agile software development and for applications with constantly changing requirements.
So consider one if you are working with semi-structured data, or need a very flexible structure that might change over time, especially at the beginning of a project. These databases also offer great possibilities for horizontal scalability and quick read/write operations.
Typical use cases for these systems are:
- Inventory management
- Live sports apps
- Managing user comments
- Web-based multiplayer games
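The flexible structure of a document store can be sketched in a few lines of Python. This is only an illustration with plain dictionaries, not the API of any real document database; the `products` collection and the `find` helper are hypothetical.

```python
import json

# Two documents in the same hypothetical "products" collection.
# They do not share the same fields, and no schema change is required.
products = [
    {"_id": 1, "name": "Laptop", "price": 1200, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "T-shirt", "price": 20, "sizes": ["S", "M", "L"]},
]


def find(collection, **criteria):
    """Return the documents whose top-level fields match all criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# A later requirement adds a field to one document only.
products[0]["tags"] = ["electronics"]

result = find(products, name="Laptop")
print(json.dumps(result, indent=2))
```

Notice that adding `tags` to a single document did not touch the other one. That is the kind of freedom that helps when requirements keep changing.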
Case Study: Coinbase scaling using MongoDB
Graph Database
Graph databases are also part of the NoSQL family. They store data in the form of nodes (also called vertices), plus what we call edges to form relationships.
Each node corresponds to an entity, and edges represent the relationships between them.
These databases are especially good for two main reasons:
- When you need visualization: a graph is one of the clearest ways to display connected data, right? Graph databases store the data in that same shape.
- Low latency: graph databases are usually faster than SQL ones when it comes to relationships, because relationships are not calculated at query time, as happens with joins in SQL systems. They are stored as edges, which are simply fetched.
An ideal example application for this could be a map app, with cities as nodes and their connections as edges. You only need to fetch the edges to find out how the cities are connected.
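Here is a small sketch of the map idea, assuming an adjacency list held in a Python dictionary rather than a real graph database. The cities, the roads between them, and the `shortest_route` function are made up for the example; the traversal is a plain breadth-first search.

```python
from collections import deque

# Adjacency list: each city (node) maps to the cities it is directly
# connected to (edges). The road network is invented for this example.
roads = {
    "Rome": ["Florence", "Naples"],
    "Florence": ["Rome", "Bologna"],
    "Bologna": ["Florence", "Milan"],
    "Milan": ["Bologna"],
    "Naples": ["Rome"],
}


def shortest_route(graph, start, goal):
    """Breadth-first search: return the route with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        city = path[-1]
        if city == goal:
            return path
        for neighbour in graph.get(city, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


route = shortest_route(roads, "Naples", "Milan")
print(route)  # ['Naples', 'Rome', 'Florence', 'Bologna', 'Milan']
```

The search only follows stored edges from one node to the next. There is no join to compute, which is the point made in the list above.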
These types of databases are also a good fit for AI-based apps, genetic data, and recommendation engines.
One of the most popular graph databases is Neo4j.
Case Study: How NASA learns from its errors using Neo4j.
Key-Value Store
We are still in the NoSQL family here. Key-value stores are quite simple: they hold nothing more than key-value pairs, and they are used for operations that need minimum latency. The key serves as an identifier and has a value associated with it. The value can be something simple like a string or something more complex like an object.
Use this type of persistence system when you need to fetch data with very low latency and minimal backend processing.
Ideal scenarios are:
- Storing user state
- Storing user sessions
- Real-time data management
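As an illustration of the session use case, here is a minimal sketch of a key-value store with expiring keys, written in plain Python. `SessionStore` is a hypothetical class and not the API of Redis or any other product; the injectable `clock` parameter is only there so that expiry can be checked without waiting.

```python
import time


class SessionStore:
    """A tiny in-memory key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries on read
            return None
        return value


store = SessionStore()
store.set("session:42", {"user": "alice", "theme": "dark"}, ttl_seconds=1800)
print(store.get("session:42"))
print(store.get("session:missing"))
```

A lookup is a single dictionary access by key, with no query to parse or plan. That simplicity is where the low latency comes from.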
Case Study: Microsoft and Redis for handling traffic spikes.
Time Series Database
These databases are optimized for tracking and persisting time series data. By that term, I mean data points that are recorded together with the time at which they occurred. The data is tracked, monitored, and then aggregated according to some business logic.
Usually, this type of data comes from self-driving cars, sensors, and real-time financial applications.
So, as you can guess, the main point of collecting such data is to perform analytics, which are vital for certain businesses: they help predict how a system will behave so you can act accordingly.
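The "aggregated" part can be sketched as downsampling: grouping timestamped readings into fixed time buckets and averaging each bucket. The sensor readings and the `downsample` function below are invented for the example, and real time series databases offer this kind of query built in.

```python
from collections import defaultdict
from statistics import mean

# (timestamp in seconds, temperature) readings from a hypothetical sensor.
readings = [
    (0, 20.0), (15, 21.0), (45, 22.0),
    (60, 23.0), (90, 25.0),
    (130, 30.0),
]


def downsample(points, bucket_seconds):
    """Average the values that fall into each fixed-size time bucket."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


per_minute = downsample(readings, bucket_seconds=60)
print(per_minute)  # {0: 21.0, 60: 24.0, 120: 30.0}
```

Six raw points become three per-minute averages, which is the shape of data a dashboard or a prediction model would typically consume.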
Case Study: Oracle uses Influx for creating analytics.
Wide-Column Database
These databases are primarily used to store and handle massive amounts of data or, if you prefer, Big Data. High performance and scalability are the main pros of these systems. They store data in records with a dynamic number of columns, sometimes even billions of them.
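A record with a dynamic number of columns can be pictured as a row key that maps to its own set of named columns. The sketch below uses nested Python dictionaries to show the idea; the row keys, column names, and helper functions are all hypothetical and do not mirror the API of any real wide-column store.

```python
from collections import defaultdict

# row key -> {column name -> value}; every row may have different columns.
table = defaultdict(dict)


def put(row_key, column, value):
    table[row_key][column] = value


def get_row(row_key):
    """Return a copy of the row, or an empty dict if the row does not exist."""
    return dict(table.get(row_key, {}))


def columns_with_prefix(row_key, prefix):
    """Return the columns of a row whose names start with the given prefix."""
    row = table.get(row_key, {})
    return {col: val for col, val in sorted(row.items()) if col.startswith(prefix)}


put("user:1", "name", "Alice")
put("user:1", "watched:2024-01-01", "Movie A")
put("user:1", "watched:2024-01-02", "Movie B")
put("user:2", "name", "Bob")

print(columns_with_prefix("user:1", "watched:"))
```

Here `user:1` has three columns and `user:2` has one. A row simply grows a new column whenever a new event is recorded, without any change to the other rows.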
Case Study: Why Netflix uses Cassandra for streaming.
Can I Manage This With A Single DB?
The answer is YES. Thanks to advances in data storage technology, special types of DBs have been created. These are called multi-model databases. They let you use different data models inside a single platform, for example a graph database combined with a document-oriented one. To learn more, check out popular options such as ArangoDB, Cosmos DB, or Couchbase.
Hopefully this article has reached its primary goal: to inform readers and to show off the various types of persistence you can apply inside your project, with benefits in both performance and capabilities.
- Icons from Icons8
- Polyglot persistence
- Horizontal scalability
- MongoDB, CouchDB, Google Cloud Datastore
- Coinbase scaling using MongoDB
- How NASA learns from its errors using Neo4j
- Redis, Memcached, Hazelcast, Riak
- Microsoft and Redis for handling traffic spikes
- InfluxDB, TimescaleDB
- Cassandra, HBase, Google Bigtable
- Oracle uses Influx for creating analytics
- Why Netflix uses Cassandra for streaming
- ArangoDB, Cosmos DB, Couchbase