The Different Types of NoSQL Databases

Different databases fit different projects. Choose wisely.

János Ruszó
CodeX
4 min read · Aug 20, 2022

Photo by fabio on Unsplash

Traditional relational databases have been with us for a very long time. The relational model was proposed by Edgar Codd in 1970 and quickly gained popularity. Today, relational database management systems (RDBMS) are a crucial part of everyday database work.

The explosive spread of the Internet and distributed computing confronted RDBMS with challenges they could no longer handle adequately. The new systems required data stores that are fast, can hold huge quantities of data, scale horizontally, and make schema changes quicker and easier. As a result, a new family of databases was born: NoSQL databases. Their primary benefit is that they can handle unstructured data, such as files, emails, and social network data.

Because of these benefits, companies such as Google, Amazon, and eBay migrated much of their workload to NoSQL databases.

NoSQL servers can store and retrieve data in different formats, and there are several approaches to categorizing them. The most commonly used classification was introduced by Ben Scofield in his presentation at the CodeMash conference in 2010. He presented the categories as follows:

Key-value stores

The data is stored as pairs of keys and values. This data structure is also known as a hash table. In a key-value database, every value is retrieved by its key, and every key can appear only once in the dataset, so the key uniquely identifies the data.

Example storage in a key-value store. Source: Wikipedia

These systems are usually among the fastest thanks to their simple implementation, and they scale out easily.
Examples of such databases: Memcached, Redis, Amazon DynamoDB, RocksDB, Berkeley DB, Oracle NoSQL Database.
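
The key-value model described above can be sketched in a few lines of Python. This is only a toy in-memory illustration built on a plain dictionary, not how any of the listed products is implemented; the class name KeyValueStore, its methods, and the sample keys are all hypothetical.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # A Python dict is a hash table: each key maps to exactly one value.
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key overwrites the old value,
        # so a key never appears twice in the dataset.
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # Every lookup goes through the key; there is no query by value.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "alice", "ttl": 3600})
store.put("session:42", {"user": "alice", "ttl": 1800})  # overwrites the first value
print(store.get("session:42"))
print(store.get("session:99", default="not found"))
```

Because the only access path is the key, such a store is easy to partition across machines by key, which is one reason these systems tend to scale out well.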

Column-family stores

Column-family stores are very similar to key-value stores. The data is organized so that each row has a unique identifier (key) and one or more columns belonging to it. Column-family stores can support very wide rows consisting of thousands of columns. The rows are grouped together into a column family, which is similar to a table.

Cassandra’s data model. Source: ScyllaDB Documentation

Because of their similarity to key-value stores, they are highly scalable and, depending on the implementation, can support very wide rows.
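Examples of such databases: Google Bigtable, Apache HBase, Hypertable, Cassandra, ScyllaDB. Amazon Redshift and ClickHouse are often listed alongside them, although they are column-oriented analytical databases rather than column-family stores.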
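
As a rough sketch of the row-key-plus-columns layout described above, the following Python snippet models a column family as a mapping from row keys to per-row column dictionaries. It is a simplified in-memory illustration under my own assumptions, not Cassandra's or Bigtable's actual storage format; ColumnFamily and its methods are hypothetical names.

```python
from typing import Any, Dict


class ColumnFamily:
    """Toy column family: row key -> {column name -> value} (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._rows: Dict[str, Dict[str, Any]] = {}

    def insert(self, row_key: str, columns: Dict[str, Any]) -> None:
        # Rows are created on first write; later writes add or overwrite columns.
        self._rows.setdefault(row_key, {}).update(columns)

    def get_row(self, row_key: str) -> Dict[str, Any]:
        # Return a copy so callers cannot modify the stored row by accident.
        return dict(self._rows.get(row_key, {}))

    def get_column(self, row_key: str, column: str, default: Any = None) -> Any:
        return self._rows.get(row_key, {}).get(column, default)


users = ColumnFamily("users")
# Each row carries its own set of columns; there is no fixed schema per row.
users.insert("user:1", {"name": "Alice", "email": "alice@example.com"})
users.insert("user:2", {"name": "Bob", "last_login": "2022-08-20"})
users.insert("user:1", {"last_login": "2022-08-19"})  # adds a column to an existing row

print(users.get_row("user:1"))
print(users.get_column("user:2", "email"))  # this row has no "email" column
```

Note that the two rows end up with different columns, which is what allows very wide and sparse rows in this model.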

Graph databases

Graph databases can be used when the data can be represented as a graph of interlinked elements. They store the data in a graph structure made up of nodes, properties, and edges.
In a graph database, a node represents an entity, properties hold information about the entity, and edges describe the relationships between entities. Graph databases are a good match for, for example, social network graphs, road network representation, and route planning.

Source: Wikipedia

Because of this complexity, the scalability of graph databases varies between implementations.
Examples of such databases: Neo4j, Amazon Neptune.
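
The node, property, and edge model described above can be illustrated with a small in-memory sketch. It is not how Neo4j or Neptune work internally; the Graph class, the FOLLOWS label, and the sample data are hypothetical. The shortest_path method uses a plain breadth-first search to hint at the route-planning use case.

```python
from collections import deque
from typing import Any, Dict, List, Optional, Tuple


class Graph:
    """Toy property graph with labeled, directed edges (hypothetical)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}        # node id -> properties
        self.edges: Dict[str, List[Tuple[str, str]]] = {}  # node id -> [(label, target)]

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def add_edge(self, source: str, label: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges[source].append((label, target))

    def neighbors(self, node_id: str, label: Optional[str] = None) -> List[str]:
        # Follow outgoing edges, optionally filtered by relationship label.
        return [t for (l, t) in self.edges.get(node_id, []) if label is None or l == label]

    def shortest_path(self, start: str, goal: str) -> Optional[List[str]]:
        # Breadth-first search: finds a path with the fewest edges, or None.
        if start not in self.nodes or goal not in self.nodes:
            return None
        queue = deque([[start]])
        visited = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in self.neighbors(path[-1]):
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
        return None


g = Graph()
g.add_node("alice", name="Alice", age=30)
g.add_node("bob", name="Bob")
g.add_node("carol", name="Carol")
g.add_edge("alice", "FOLLOWS", "bob")
g.add_edge("bob", "FOLLOWS", "carol")

print(g.neighbors("alice", label="FOLLOWS"))
print(g.shortest_path("alice", "carol"))
```

Real graph databases expose this kind of traversal through a query language, but the underlying idea of walking from node to node along edges is the same.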

Document-oriented databases

Document stores represent the data in so-called documents. A document is a piece of semi-structured data. Documents are self-describing, so in most cases the database does not use a pre-set schema; every document carries its own key/value pairs. The document format differs between technologies; common choices include JSON, XML, and YAML. Document databases usually provide extra functionality, such as storing lists, pointers, and nested documents. Documents are referenced by a unique key, which is usually a string, URI, or path.

Structure of document-oriented storage

Examples of such databases: MongoDB, Amazon DocumentDB, CouchDB, Elasticsearch.
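
To make the idea of self-describing, schema-free documents more concrete, here is a minimal in-memory sketch in Python. It stores JSON-like dictionaries under a unique key and supports a simple lookup on nested fields. The DocumentStore class, the dotted-path syntax of find, and the sample documents are assumptions made for illustration; they do not mirror the API of MongoDB or any other product.

```python
import copy
from typing import Any, Dict, List, Optional

_MISSING = object()  # sentinel for "field not present in this document"


def _lookup(document: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'address.city' through nested documents."""
    current: Any = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


class DocumentStore:
    """Toy document store keyed by a unique string id (hypothetical)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def insert(self, doc_id: str, document: Dict[str, Any]) -> None:
        # No schema check: each document brings its own fields.
        self._docs[doc_id] = copy.deepcopy(document)

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        doc = self._docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def find(self, path: str, value: Any) -> List[str]:
        # Return the ids of all documents whose field at `path` equals `value`.
        return [doc_id for doc_id, doc in self._docs.items()
                if _lookup(doc, path) == value]


docs = DocumentStore()
docs.insert("user/1", {
    "name": "Alice",
    "address": {"city": "Budapest"},   # nested document
    "tags": ["admin", "dba"],          # list value
})
docs.insert("user/2", {"name": "Bob", "phone": "555-0100"})  # different fields

print(docs.find("address.city", "Budapest"))
print(docs.get("user/2"))
```

The two documents share only the name field, yet both fit in the same store, which is the practical meaning of having no pre-set schema.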

As we can see, numerous NoSQL databases are available, and the right one depends on the data we plan to store. Even within a category, the individual databases provide different functionality, benefits, and limitations. Each category also contains many other databases that specialize in particular niches, so choosing a database for a project is a complicated task that requires time and effort.

János Ruszó
CodeX

A Senior Database Engineer who is focused on MySQL and exploring other technologies like MongoDB, BigQuery, ClickHouse…