SQL vs. NoSQL: Build modern apps with Purpose-built databases

13 min read · Feb 12, 2023

Advances in technology have reshaped the database landscape over the past thirty years. The relational database was once the default choice for applications, but the rise of internet-scale and cloud applications increased the demand for databases that are faster and globally distributed. This shift led to the emergence of purpose-built databases, which are tailored to specific application needs. Developers now have a range of options, such as document databases for mobile applications with heterogeneous data, graph databases for highly connected data, and in-memory caches for latency-sensitive applications. This article briefly describes several purpose-built database types and the factors to consider when choosing among them. It is intended to help software architects, developers, and data scientists make informed decisions when building applications with databases.

Background

Let’s examine the evolution of application architecture over recent decades to grasp the significance of databases designed specifically for a purpose. Afterwards, we will delve into the functioning of databases in current application development.

Evolution of application architectures

The evolution of application architecture has seen significant changes in recent decades, starting with the use of mainframes for critical applications, which combined compute and storage functions. The shift to client-server architecture allowed for more distributed systems and scalability. The rise of the internet saw the implementation of the three-tier architecture, with applications separated into presentation, application, and data tiers for increased scalability. The latest trend is microservices, where applications are split into separate services based on their functionality for better agility and independent scalability.

Databases in the world of microservices

The evolution of databases has paralleled that of application architectures. Initially, hierarchical databases were used for internal systems; they stored data as a tree and had limited ability to model complex relationships. Relational databases, with strict schemas, normalized records, and the SQL query language, then came to dominate the market, with an emphasis on data integrity and storage cost savings. The advent of web-enabled global businesses and cloud computing sparked the emergence of NoSQL databases for better performance and scalability, and the trend towards microservices in application development paved the way for purpose-built databases, allowing developers to choose the best fit for each service.

Factors to consider when choosing a database

Selecting the right database is a crucial aspect of your application architecture that has a significant impact on the access patterns, performance, and operational responsibilities of your application. A range of factors should be taken into consideration, such as the workload of your application, the form of your data, the desired level of performance, and the operational burden.

Application workload

When deciding on a purpose-built database, it’s crucial to consider the application workload. The workload refers to the type of data stored and the data access patterns in the application. There are three main categories of workloads: transactional (OLTP), analytical (OLAP), and caching. Transactional workloads involve a high number of concurrent operations with each operation reading or writing a small number of rows, commonly seen in user-facing apps like ecommerce and social networking. Analytical workloads aggregate and summarize large amounts of data with fewer concurrent queries but operating on many more rows per query, typically used in internal reporting apps. Caching workloads involve storing frequently accessed data in a separate database for faster response times, reducing the load on the transactional database and improving response times to users. Knowing the type of workload will aid in choosing the right database for your use case.
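The difference between transactional and analytical access can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a benchmark: the `orders` table, its columns, and the sample rows are hypothetical and do not come from any real application.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 12.5), ("carol", 40.0)],
)


def fetch_order(order_id):
    """Transactional (OLTP) access: read a single row by its key."""
    return conn.execute(
        "SELECT customer, amount FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


def revenue_by_customer():
    """Analytical (OLAP) access: scan and aggregate many rows."""
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return dict(rows)
```

In a user-facing app, many calls like `fetch_order` run concurrently, while a query like `revenue_by_customer` runs rarely but touches every row. That difference in access pattern is what pushes the two workloads toward different databases.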

Data shape

The second factor to consider when selecting a database is the data shape. This refers to the entities you will model, the relationships between these entities, the ways you will access your data, and the frequency of entity updates. There are several common data models to choose from, including relational databases, key-value or wide-column databases, document databases, and graph databases. It’s important to understand your access patterns to make the best choice for your application.

Performance

The performance requirements of your application are another crucial factor in selecting a purpose-built database. These cover not only the speed of data access and the size of records, but also the proximity of data to end users. For critical workloads with users awaiting a response, speed is essential and an in-memory cache may be necessary to reduce latency. For internal analytics or background data processing, latency may matter less and the focus may shift to handling the volume of data. It is also important to consider geographical needs for your data. Some databases offer cross-region replication, bringing data closer to users and reducing response times. Examples include Amazon DynamoDB (global tables) and Amazon Aurora (Global Database); PostgreSQL can also be replicated across regions, although this typically takes more setup when self-managed.

Operations burden

When choosing a database, it’s important to consider the operations burden it will bring. This includes preparing for instance failures, backups, and upgrades. With a fully managed database service, like those offered by cloud platforms, much of this operations burden is handled for you, allowing you to focus on developing features for your users.

Well-suited databases

The table below gives a quick look at the database portfolio. The descriptions that follow explain when to apply each type.

Relational

Relational databases are a type of database that store data in a structured format using tables, rows, and columns. They are based on the relational model, which organizes data into one or more tables and defines relationships between the tables using keys.

Some common use cases for relational databases include:

E-commerce: Online stores often use relational databases to store customer information, product details, and order history.
Financial services: Banks and other financial institutions use relational databases to track customer information, transactions, and account balances.
Healthcare: Hospitals and clinics use relational databases to store patient information, medical records, and appointment schedules.
Human resources: Companies use relational databases to store employee information, payroll data, and benefits information.
Inventory management: Retailers and manufacturers use relational databases to keep track of stock levels, suppliers, and purchase orders.

These are just a few examples of the many ways that relational databases are used. In general, they are a popular choice for applications that require a high degree of structure and organization, as well as the ability to quickly retrieve and manipulate data.
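As a rough sketch of how tables are related through keys, the following example uses Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents are hypothetical, chosen to mirror the e-commerce use case above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        product     TEXT NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders (customer_id, product)
        VALUES (1, 'keyboard'), (1, 'mouse'), (2, 'monitor');
    """
)


def orders_for(name):
    """Join the two tables through the customer_id foreign key."""
    rows = conn.execute(
        "SELECT o.product FROM orders AS o "
        "JOIN customers AS c ON c.id = o.customer_id "
        "WHERE c.name = ? ORDER BY o.id",
        (name,),
    )
    return [product for (product,) in rows]


def insert_orphan_order():
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, product) VALUES (99, 'ghost')"
        )
    except sqlite3.IntegrityError:
        return False  # the database rejected the inconsistent row
    return True
```

The join answers a question that spans both tables, and the foreign key constraint lets the database itself reject an order that points at a missing customer. That built-in integrity checking is a large part of why relational databases suit highly structured data.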

Key-Value

Key-value databases, also known as key-value stores, are a type of NoSQL database that store data as a collection of key-value pairs. In this type of database, each piece of data is identified by a unique key, and the value associated with that key is the data itself. The key-value pair can store simple data types such as strings or numbers, or more complex data structures like arrays or objects.

Key-value databases are designed to provide fast and efficient access to data, with a simple and flexible data model that makes it easy to scale and manage large amounts of data. They are often used for applications that require a high degree of speed and scalability, such as:

Caching: Key-value databases are often used as a caching layer in front of a traditional relational database, to speed up data retrieval by storing frequently accessed data in memory.
Session management: Key-value databases can be used to store session data for web applications, allowing developers to store information about a user’s session and retrieve it quickly as needed.
Gaming: Key-value databases are used to store game state information, high scores, and other data in real-time multiplayer games.
Distributed systems: Key-value databases are often used in distributed systems to store data across multiple nodes, providing high availability and fault tolerance.
Metrics and analytics: Key-value databases can be used to store and process large amounts of time-series data, such as metrics and analytics data from web applications and mobile apps.

These are just a few examples of the many ways that key-value databases are used. They are well-suited for applications that require fast data retrieval and flexible, scalable data storage.
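The session management use case can be sketched with a plain dictionary plus an expiry time per key. The `SessionStore` class below is a hypothetical toy written for this article, not the API of any real key-value product. It accepts an injectable clock so that expiry can be demonstrated without waiting.

```python
import time


class SessionStore:
    """Toy key-value store: each key maps to a value and an expiry time."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def put(self, key, value, ttl_seconds):
        """Store a value that expires ttl_seconds from now."""
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        """Return the value for key, or default if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return default
        return value
```

Every operation is a lookup by exact key, with no joins or scans. That narrow interface is what makes key-value stores comparatively easy to partition across nodes.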

Graph databases are designed to identify and work with the connections between data points. They are often used to analyze relationships between heterogeneous data points (data with high variability in type and format), such as in fraud prevention or in mining customer data from social media.

In-Memory

In-memory databases are databases that store data entirely in RAM (Random Access Memory), instead of on disk. This means that all data is kept in memory, allowing for extremely fast data access times, as accessing data in RAM is much faster than accessing data on disk.

In-memory databases are used in a variety of applications where performance is critical, such as:

Real-time analytics: In-memory databases are well-suited for real-time analytics, as they can process large amounts of data quickly and provide instant insights.
Gaming: In-memory databases are used in gaming applications to store game state information, player information, and other data, allowing for real-time updates and fast response times.
Trading systems: Financial trading systems use in-memory databases to store and process real-time market data and execute trades quickly.
Caching: In-memory databases can be used as a caching layer in front of a traditional relational database, to speed up data retrieval by storing frequently accessed data in memory.
Fraud detection: In-memory databases can be used in real-time fraud detection systems, where data must be processed quickly to identify and prevent fraudulent activity.

While in-memory databases are fast and efficient, they also have limitations. Large amounts of RAM are expensive, and the memory available on a single system limits how much data can be stored. Additionally, if the system loses power or crashes, data held only in memory is lost unless the database also writes snapshots or logs to disk. For these reasons, in-memory databases are typically used in combination with disk-based databases to provide a balance of performance and persistence.
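One common way to combine an in-memory store with a disk-based database is a write-through cache: every write goes to disk first and then to memory, and reads are served from memory when possible. The sketch below assumes a hypothetical `WriteThroughCache` class, with a SQLite file standing in for the disk-based database. It is an illustration of the pattern, not a production design.

```python
import os
import sqlite3
import tempfile


class WriteThroughCache:
    """In-memory dict in front of a disk-backed SQLite table."""

    def __init__(self, db_path):
        self._memory = {}
        self._disk = sqlite3.connect(db_path)
        self._disk.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )
        self._disk.commit()

    def set(self, key, value):
        # Persist first, then update memory, so memory never holds
        # a value that was not written to disk.
        self._disk.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )
        self._disk.commit()
        self._memory[key] = value

    def get(self, key):
        if key in self._memory:  # fast path: served from RAM
            return self._memory[key]
        row = self._disk.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self._memory[key] = row[0]  # warm the cache for the next read
        return row[0]

    def close(self):
        self._disk.close()


def demo():
    """Write a value, simulate a restart, and read the value back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.db")
        first = WriteThroughCache(path)
        first.set("player:1:score", "1200")
        first.close()  # the in-memory dict is discarded here
        second = WriteThroughCache(path)
        value = second.get("player:1:score")
        second.close()
        return value
```

The second instance starts with an empty in-memory dictionary, yet the value survives because it was written to disk. The trade-off is that every write pays the disk latency; a write-behind design would be faster but could lose recent writes in a crash.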

Document

Document databases are a type of NoSQL database that store data in the form of documents, rather than tables, rows, and columns. A document can be thought of as a collection of key-value pairs, where the keys are field names and the values are the data stored in those fields. Documents can also contain nested data structures, such as arrays and sub-documents.

Some common use cases for document databases include:

Content management: Document databases are often used to store and manage unstructured content, such as text, images, and multimedia, in a way that allows for easy retrieval and manipulation of the data.
E-commerce: Online stores often use document databases to store product information, customer information, and order history, allowing for flexible and scalable data storage.
Real-time analytics: Document databases are well-suited for real-time analytics, as they can store and process large amounts of semi-structured data quickly and efficiently.
Gaming: Document databases are used in gaming applications to store player information, game state information, and other data, allowing for real-time updates and fast response times.
Logging and event tracking: Document databases can be used to store and process log data and event data from web applications and other systems, providing a flexible and scalable data storage solution.

Document databases are often favored for applications that require a high degree of flexibility in terms of data structure, as they allow for the storage of semi-structured and unstructured data in a way that is easy to manage and retrieve. Additionally, they are well-suited for use in distributed systems, as they provide a scalable and flexible data storage solution that can be easily partitioned across multiple nodes.
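To make the document model concrete, here is a minimal sketch that treats Python dictionaries as documents and queries them by a nested field. The product documents and the `get_path` and `find` helpers are hypothetical and written for illustration. Real document databases offer much richer query languages and indexes.

```python
# Documents in the same collection do not have to share a structure.
documents = [
    {"_id": 1, "name": "T-shirt", "tags": ["clothing"],
     "attrs": {"size": "M", "color": "blue"}},
    {"_id": 2, "name": "E-book", "tags": ["digital"],
     "attrs": {"format": "epub"}},
    {"_id": 3, "name": "Hoodie", "tags": ["clothing", "winter"],
     "attrs": {"size": "L"}},
]


def get_path(doc, dotted_path):
    """Follow a dotted path such as 'attrs.size' through nested dicts."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # the document has no such field
        current = current[part]
    return current


def find(collection, dotted_path, expected):
    """Return the documents whose field at dotted_path equals expected."""
    return [doc for doc in collection if get_path(doc, dotted_path) == expected]
```

Note that the e-book has no `size` field at all, and nothing had to be migrated to allow that. This is the structural flexibility the section describes.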

Wide column

Wide column databases, also known as column-family databases, are a type of NoSQL database that store data in rows whose columns are not fixed in advance, unlike the uniform rows and columns of a traditional relational table. Each row can hold a different set of columns, and each column can contain a different type of data.

In a wide column database, data is typically stored in the form of a column family, which is a collection of related columns. Each row in a column family is identified by a unique key, and the data for that row is stored in the associated columns.

Wide column databases are well-suited for applications that require high write and read performance, such as:

Big data processing: Wide column databases are often used to store and process large amounts of semi-structured and unstructured data, such as log data, event data, and sensor data.
Real-time analytics: Wide column databases can be used for real-time analytics, as they can store and process large amounts of data quickly and efficiently.
Gaming: Wide column databases are used in gaming applications to store player information, game state information, and other data, allowing for real-time updates and fast response times.
Financial systems: Wide column databases are used in financial systems to store and process large amounts of financial data, such as stock prices and trade data.
Social media: Wide column databases are used in social media applications to store user information, posts, and other data, allowing for flexible and scalable data storage.

Wide column databases are often favored for applications that require fast and flexible data storage, as they allow for easy and efficient storage and retrieval of large amounts of semi-structured and unstructured data. Additionally, they are well-suited for use in distributed systems, as they provide a scalable and flexible data storage solution that can be easily partitioned across multiple nodes.
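The row key and column family layout described above can be modeled as nested dictionaries: row key, then column family, then column name. The `WideColumnTable` class and the sample rows below are hypothetical, a toy model of the data layout rather than the interface of any specific wide column database.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy model: row key -> column family -> {column name: value}."""

    def __init__(self, families):
        self._families = set(families)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        """Write one cell; column names are free-form within a family."""
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][column] = value

    def get_row(self, row_key, family):
        """Return a copy of all columns stored for a row in one family."""
        if row_key not in self._rows:
            return {}
        return dict(self._rows[row_key].get(family, {}))


table = WideColumnTable(families=["profile", "activity"])
table.put("user#1", "profile", "name", "Alice")
table.put("user#1", "activity", "2023-02-12", "login")
table.put("user#2", "profile", "name", "Bob")
table.put("user#2", "profile", "email", "bob@example.com")
```

Column families are declared up front, but the columns inside them are not: `user#2` has an `email` column that `user#1` lacks. Because every read and write is addressed by row key, rows can be spread across nodes by key.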

Graph

Graph databases are a type of NoSQL database that store data in the form of nodes and edges, rather than tables, rows, and columns. In a graph database, nodes represent entities or objects, and edges represent relationships or connections between those entities.

Graph databases are used to store and process data about relationships and connections between entities, making them well-suited for applications that require the analysis of complex, interconnected data. Some common use cases for graph databases include:

Social network analysis: Graph databases are often used to store and analyze social network data, such as friend relationships and connections between users on a social media platform.
Fraud detection: Graph databases are used to detect fraud by analyzing relationships between entities, such as transactions and people, to identify patterns of behavior that are indicative of fraud.
Recommendation systems: Graph databases are used to store and analyze user behavior data, such as product purchases and online searches, to provide personalized recommendations to users.
Knowledge graphs: Graph databases are used to store and manage knowledge graphs, which are graph-based representations of knowledge about a particular domain or subject.
Network analysis: Graph databases are used to store and analyze data about networks, such as telecommunications networks, transportation networks, and biological networks.

Graph databases are often favored for applications that require the analysis of complex, interconnected data, as they provide a flexible and scalable way to store and process data about relationships and connections between entities. Additionally, they are well-suited for use in real-time systems, as they allow for fast and efficient querying of graph data.
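The node and edge model, and the kind of traversal behind friend suggestions in a social network, can be sketched with an adjacency list and a breadth-first search. The `Graph` class and the names in the sample network are hypothetical. Real graph databases expose dedicated query languages instead of hand-written traversals.

```python
from collections import defaultdict, deque


class Graph:
    """Undirected graph stored as an adjacency list."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_edge(self, a, b):
        self._edges[a].add(b)
        self._edges[b].add(a)

    def within_hops(self, start, max_hops):
        """Breadth-first search: map each node reachable in at most
        max_hops edges to its distance from start."""
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distance[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in self._edges.get(node, ()):
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        del distance[start]
        return distance


network = Graph()
network.add_edge("alice", "bob")
network.add_edge("bob", "carol")
network.add_edge("carol", "dave")
```

Calling `network.within_hops("alice", 2)` returns Bob at distance 1 and Carol at distance 2, so Carol is a friend-of-a-friend candidate. In a relational schema the same question needs a self-join per hop, which is why relationship-heavy queries favor a graph model.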

Time-series

Time series databases are a type of database specifically designed to store and manage time-stamped data. Time series data is data that is collected and recorded over time, such as financial data, sensor data, and log data.

Time series databases are optimized for handling time-stamped data, and are designed to store, retrieve, and aggregate time-series data efficiently. Some common use cases for time series databases include:

IoT data: Time series databases are used to store and process large amounts of sensor data from IoT devices, such as temperature, humidity, and pressure data.
Financial data: Time series databases are used to store and analyze financial data, such as stock prices, trade data, and economic indicators.
Log data: Time series databases are used to store and analyze log data, such as server logs, application logs, and network logs.
Metrics and monitoring: Time series databases are used to store and aggregate metrics data, such as system performance data, application performance data, and user behavior data.
Environmental data: Time series databases are used to store and analyze environmental data, such as weather data, air quality data, and water quality data.

Time series databases are often favored for applications that require the efficient storage, retrieval, and aggregation of time-stamped data, as they are optimized for handling this type of data. Additionally, they are well-suited for use in real-time systems, as they allow for fast and efficient querying of time-series data.
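A typical time series operation is downsampling: grouping time-stamped points into fixed-width buckets and aggregating each bucket. The sketch below assumes integer timestamps in seconds and hypothetical temperature readings. The `downsample` function is written for illustration and is not taken from any time series product.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a dict that maps each bucket's start time to the mean of
    the values that fall inside it, in ascending time order.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Hypothetical sensor readings: (seconds since start, temperature).
readings = [(0, 20.0), (30, 22.0), (60, 21.0), (90, 23.0), (120, 25.0)]
```

With one-minute buckets, `downsample(readings, 60)` reduces five raw points to three averages. Time series databases apply the same idea at scale, often keeping recent data at full resolution and older data only in aggregated form.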

Ledger

Ledger databases, which are closely related to blockchain databases, maintain a tamper-evident, sequential record of transactions. In a blockchain-style ledger, transactions are recorded in blocks, and each block is cryptographically linked to the previous block, creating a chain of blocks, or a blockchain.

Ledger databases are designed to provide a secure and transparent way to store and manage data, and are commonly used in applications that require a high degree of trust and accountability, such as:

Cryptocurrency: Ledger databases are used to store and manage transactions in cryptocurrencies, such as Bitcoin and Ethereum.
Supply chain management: Ledger databases are used to track the movement of goods and materials through a supply chain, ensuring transparency and accountability throughout the supply chain.
Healthcare: Ledger databases are used to store and manage patient health data, providing secure and transparent access to health data to authorized parties.
Voting systems: Ledger databases are used to provide a secure and transparent way to manage and count votes in elections, ensuring the integrity of the voting process.
Real estate: Ledger databases are used to store and manage real estate transactions, providing a secure and transparent record of ownership and property data.

Ledger databases are often favored for applications that require a high degree of trust and accountability, as they provide a secure and transparent way to store and manage data. Additionally, they are well-suited for use in decentralized systems, as they provide a decentralized way to store and manage data that is resistant to tampering and manipulation.
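The linking of each record to the previous one can be sketched with a hash chain built on Python's `hashlib`. The `Ledger` class and the sample transfers are hypothetical. This is a single-process illustration of tamper evidence, with none of the consensus or distribution a real blockchain involves.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class Ledger:
    """Append-only list in which each entry stores the previous hash."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(previous_hash, payload):
        body = json.dumps(
            {"prev": previous_hash, "payload": payload}, sort_keys=True
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def append(self, payload):
        """Add a record whose hash covers the payload and the prior hash."""
        previous_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        entry = {
            "prev": previous_hash,
            "payload": payload,
            "hash": self._digest(previous_hash, payload),
        }
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; any edit to history breaks the chain."""
        previous_hash = GENESIS_HASH
        for entry in self.entries:
            if entry["prev"] != previous_hash:
                return False
            if entry["hash"] != self._digest(entry["prev"], entry["payload"]):
                return False
            previous_hash = entry["hash"]
        return True
```

Because each hash depends on the one before it, changing an old record invalidates its own hash and, if that hash is recomputed, the link from the next record. The ledger does not prevent edits to the underlying storage, but `verify` makes them detectable.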

Conclusion

Purpose-built databases are created with a specific function or objective in mind. They are designed to meet the needs of a particular application or industry and are optimized for the tasks they are intended to perform. As a result, they can outperform general-purpose databases on the workloads they target.

For more details, see the following links: