NoSQL

Jeyasumangala Rasanayagam
6 min read · Feb 18, 2020

What is NoSQL?

NoSQL refers to non-SQL or non-relational databases used to store and retrieve data. Here, “No” is usually read as “not only” rather than as a plain “no”, because many NoSQL databases support SQL-like queries. A NoSQL database does not need a fixed schema. NoSQL is mainly used for big data: companies such as Facebook and Google generate huge amounts of data every second, and NoSQL is considered a suitable database system for that workload. NoSQL has existed for many years, but it has only recently become popular.

Difference between SQL and NoSQL

  • SQL stands for Structured Query Language. It is the interface used to interact with relational databases, which store data in tables. NoSQL databases do not follow the rules of the relational model: they do not store data in traditional tables, and they generally do not use SQL to query the data.
  • SQL databases have a predefined schema, while NoSQL databases use a dynamic schema for unstructured data.
  • SQL databases are vertically scalable, while NoSQL databases are horizontally scalable. A SQL database is typically scaled by increasing the horsepower of the hardware, that is, by adding CPU and RAM to a single server. A NoSQL database is scaled by adding more database servers.
  • SQL databases use Structured Query Language to define and manipulate data. In NoSQL databases, queries focus on collections of documents; this is sometimes called an unstructured query language.
  • SQL databases are best suited to complex queries. NoSQL databases are not a good fit for complex queries, because they lack standard interfaces for them and their query languages are generally less powerful than SQL.
  • SQL databases are not well suited to hierarchical data storage. NoSQL is a better fit for hierarchical data because it stores data as key-value pairs.
  • NoSQL databases are a good choice for storing huge data sets, that is, big data. SQL database performance can degrade as the stored data grows very large.
  • SQL databases emphasize ACID properties, while NoSQL databases follow the CAP theorem.
  • Examples of SQL databases :- MySQL, Oracle Express Edition, SQLite, PostgreSQL and Microsoft SQL Server
  • Examples of NoSQL databases :- MongoDB, Cassandra, HBase, CouchDB.
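The schema difference above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module for the relational side and plain dictionaries as a stand-in for a document store; the table name, field names and helper functions are assumptions chosen for illustration, not part of any particular NoSQL product.

```python
import sqlite3

# Relational side: the table schema is declared before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Asha"))


def insert_with_unknown_column():
    """Try to store a field that the predefined schema does not declare."""
    try:
        conn.execute(
            "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
            (2, "Ravi", "ravi@example.com"),
        )
        return "accepted"
    except sqlite3.OperationalError:
        # SQLite refuses the statement because the column does not exist.
        return "rejected"


# Document side (hypothetical stand-in): each record carries its own fields.
documents = [
    {"_id": 1, "name": "Asha"},
    {"_id": 2, "name": "Ravi", "email": "ravi@example.com", "tags": ["admin"]},
]


def fields_used(docs):
    """Collect every field name that appears in at least one document."""
    return sorted({key for doc in docs for key in doc})
```

In the relational case the schema would have to be altered before the new field could be stored, while the second document simply includes the extra fields.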

Features of NoSQL

  1. It is non-relational
    It does not follow the relational model, so it does not store data in a tabular format. It does not require object-relational mapping or data normalization, which can shorten development time and improve reliability.
  2. Dynamic Schema
    It is either schema-free or has a relaxed schema. NoSQL is commonly based on a dynamic schema that stores unstructured data.
  3. Simple API
    It offers easy-to-use interfaces for storing and querying data.
    The API also allows low-level data manipulation.
  4. Auto-sharding
    NoSQL is based on horizontal scaling, which means servers are added instead of increasing the capacity of a single server.
    NoSQL provides auto-sharding, which means it automatically spreads data across a number of servers. If a server fails, its data can be replaced quickly and transparently without disrupting the application.
  5. Replication
    NoSQL allows automatic database replication, which maintains availability in case of outages.
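As a rough illustration of sharding and replication, the sketch below hashes each key to pick a primary server and copies it to the next server in the list. The server names, the replica count and both function names are assumptions made for this example. Real systems usually use more elaborate schemes such as consistent hashing, which this simple modulo approach is not.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical node names
REPLICAS = 2  # assumed number of copies kept for each key


def owners(key, servers=SERVERS, replicas=REPLICAS):
    """Return the servers that hold a key: a primary plus its replicas."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]


def read_from(key, failed=frozenset()):
    """Pick the first owner of the key that has not failed, or None."""
    for server in owners(key):
        if server not in failed:
            return server
    return None
```

For example, if the primary returned by `owners("user:42")` is marked as failed, `read_from` falls back to the replica, which is the behavior the replication feature is meant to provide.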

Types of NoSQL

Mainly, there are four types of NoSQL databases.

  1. Key-value stores
  2. Column-oriented
  3. Graph-based
  4. Document-oriented

Key-value stores

Data is stored as key-value pairs. These databases work on a simple data model that pairs a unique key with a value, and they are designed to handle heavy data loads.

In a key-value store, the key is a string of characters and the value is a series of uninterpreted bytes that are opaque to the database.

Key-value stores have no query language. The PUT, GET and DELETE commands are used to store or update, retrieve and remove data. This simple structure makes key-value stores fast, easy to access, scalable, portable and flexible.
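A minimal in-memory sketch of the three commands might look like the following. The class and method names are hypothetical and do not mirror the API of Dynamo, Riak or any other product; the point is only that the store never inspects the value.

```python
class KeyValueStore:
    """Toy key-value store: string keys mapped to opaque byte values."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # Stores a new value or overwrites the existing one for this key.
        self._data[key] = value

    def get(self, key: str):
        # Returns the stored bytes, or None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if a value was removed, False if the key was absent.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("session:1", b'{"user": "asha"}')
```

Because lookups go straight from key to value with no query planning, each operation stays simple and fast, at the cost of not being able to search by the contents of the value.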

Examples:-

  • Dynamo
  • Riak

Column-oriented

This type of database stores data in columns instead of rows, which lets it query large data sets faster than conventional row-oriented databases. Columns do not have to be consistent across records: a column can be added to specific rows without adding it to every record.

This type is commonly used for catalogs, data warehouses, business intelligence and fraud detection. It delivers high performance on aggregation queries such as SUM, COUNT and AVG, because the data is readily available in a column.
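The sketch below shows why aggregations are cheap when values are grouped by column. The sample records, the `to_columns` helper and the field names are made up for illustration; a real column store adds compression, on-disk layout and distribution on top of this idea.

```python
# Row-oriented input: each record is stored together.
rows = [
    {"order_id": 1, "region": "north", "amount": 120},
    {"order_id": 2, "region": "south", "amount": 80},
    {"order_id": 3, "region": "north", "amount": 200, "coupon": "SPRING"},
]


def to_columns(records):
    """Regroup records so that each column maps row index -> value."""
    columns = {}
    for index, record in enumerate(records):
        for name, value in record.items():
            columns.setdefault(name, {})[index] = value
    return columns


columns = to_columns(rows)

# Aggregations only touch the one column they need.
total = sum(columns["amount"].values())   # SUM(amount)
count = len(columns["amount"])            # COUNT(amount)
average = total / count                   # AVG(amount)
```

Note that the `coupon` column holds a value only for the third record, which matches the point that a column can exist for some rows without being added to all of them.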

Examples:-

  • HBase
  • Cassandra
  • Hypertable

Graph Based

This type differs slightly from the others because the data is represented as a graph of entities and the relationships between them. Each node in the graph is a free-form chunk of data, and the edges represent the relationships between nodes. Every node and edge has a unique identifier.

Compared with a relational database, where tables are loosely connected, a graph database is multi-relational in nature, so traversing relationships is fast. Graph databases are mostly used for social networks, logistics and spatial data.
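To make the node-and-edge model concrete, here is a small sketch of a social graph with a breadth-first traversal. The node names, the `FOLLOWS` relation and the `within_hops` function are assumptions for this example and are unrelated to the query languages of Neo4j or other graph databases.

```python
from collections import deque

# Nodes: unique identifier -> free-form properties.
nodes = {
    "alice": {"city": "Colombo"},
    "bob": {"city": "Kandy"},
    "chitra": {"city": "Galle"},
    "dinesh": {"city": "Jaffna"},
}

# Edges: (edge id, source node, relation, target node).
edges = [
    ("e1", "alice", "FOLLOWS", "bob"),
    ("e2", "bob", "FOLLOWS", "chitra"),
    ("e3", "chitra", "FOLLOWS", "dinesh"),
    ("e4", "alice", "FOLLOWS", "chitra"),
]

# Index edges by (source, relation) so each hop is a direct lookup.
adjacency = {}
for _edge_id, source, relation, target in edges:
    adjacency.setdefault((source, relation), []).append(target)


def within_hops(start, relation, max_hops):
    """Breadth-first search: nodes reachable in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in adjacency.get((node, relation), []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found
```

Each hop follows stored edges directly rather than joining tables, which is the reason relationship traversal tends to be fast in this model.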

Examples :-

  • Neo4j
  • OrientDB
  • FlockDB

Document based

Here, data is stored as key-value pairs, but the value is a document, typically a free-form JSON structure whose fields can hold anything from integers to strings to free-form text.

This model fits use cases such as catalogs, real-time analytics, e-commerce applications and content management systems, where each document is unique. It is not well suited to complex transactions that span multiple operations or queries.
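A document store can be sketched as a key-value store whose values are JSON documents that the database is able to look inside. The `save` and `find` helpers and the catalog entries below are hypothetical and only illustrate the idea; they are not the API of CouchDB or MongoDB.

```python
import json

collection = {}  # document id -> JSON text


def save(doc_id, document):
    """Store a document as JSON under its id."""
    collection[doc_id] = json.dumps(document)


def find(field, value):
    """Return ids of documents whose top-level field equals the value."""
    matches = []
    for doc_id, raw in collection.items():
        document = json.loads(raw)
        if document.get(field) == value:
            matches.append(doc_id)
    return matches


# Catalog entries with different shapes can live in the same collection.
save("p1", {"type": "book", "title": "Graphs", "pages": 320})
save("p2", {"type": "shirt", "sizes": ["S", "M"], "colour": "blue"})
save("p3", {"type": "book", "title": "Queues", "authors": ["K. Perera"]})
```

Unlike a plain key-value store, the lookup in `find` depends on the contents of the value, which is what distinguishes the document model.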

Examples:-

  • CouchDB
  • MongoDB
  • Riak

CAP Theorem

What is CAP Theorem?

The CAP theorem was introduced by Eric Brewer in 2000, so it is also called Brewer’s theorem.

C - Consistency
A - Availability
P - Partition Tolerance

A distributed system cannot meet all three requirements at the same time, so a combination of two is usually chosen.

Consistency

Consistency means the data remains consistent after an operation executes. If a transaction starts with the system in a consistent state and ends with it in a consistent state, the system has consistency.

All servers in the system hold the same data, so users get the same copy regardless of which server answers their request.

Availability

Availability means the database should always be available and responsive to requests. It should have no downtime.

Partition Tolerance

Partition tolerance means the system should continue to function even when communication among the servers is unreliable. In other words, the system can sustain any amount of network failure that does not result in a failure of the whole network.
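The trade-off between the three properties can be sketched with a toy two-replica cluster. Everything here, including the `Cluster` class and the "CP" and "AP" mode labels, is a simplified assumption for illustration; real systems use quorums, timeouts and conflict resolution rather than a single flag.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two replicas that either stay consistent (CP) or stay available (AP)."""

    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP" (hypothetical labels)
        self.replicas = [Replica("r1"), Replica("r2")]
        self.partitioned = False  # True when the replicas cannot talk

    def write(self, replica_index, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: data stays consistent, availability is lost.
                return False
            # AP: accept locally; the other replica now holds stale data.
            self.replicas[replica_index].data[key] = value
            return True
        # No partition: every replica receives the write.
        for replica in self.replicas:
            replica.data[key] = value
        return True

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)
```

During a partition, the CP cluster rejects the write so both replicas keep returning the same value, while the AP cluster accepts it and the two replicas temporarily disagree.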

Many NoSQL databases do not fully support ACID properties, trading strict consistency for availability and scalability. Relational databases support ACID properties.

Advantages and Disadvantages of NoSQL

Advantages

  • No single point of failure
  • Big data capability
  • Simpler to implement than a relational database
  • Fast performance
  • Horizontal scalability
  • Suitable for structured, unstructured and semi-structured data
  • Elastic scalability

Disadvantages

  • Less community support
  • Limited pool of advanced expertise
  • Limited query capabilities
  • No standard rules

NewSQL

Most programmers are familiar with SQL and relational database management systems. After 2000, NoSQL databases emerged and proved well suited to handling big data.

And now there is a newborn child here: NewSQL.

NewSQL is a class of modern SQL databases that address some of the major drawbacks of traditional online transaction processing (OLTP) relational database systems.

The term NewSQL is not quite as broad as NoSQL. NewSQL systems keep the relational data model and the SQL query language, and they all try to address some of the same scalability, inflexibility or lack-of-focus problems that drove the NoSQL movement.

Advantages of NewSQL

  • It is highly scalable
  • It can handle complex data

Examples :-

  • ClustrixDB
  • NuoDB
  • VoltDB
  • CockroachDB

References

  1. https://www.guru99.com/nosql-tutorial.html
  2. https://www.infoworld.com/article/3240644/what-is-nosql-databases-for-a-cloud-scale-future.html
  3. https://www.hadoop360.datasciencecentral.com/blog/advantages-and-disadvantages-of-nosql-databases-what-you-should-k
  4. https://www.atlantic.net/hipaa-compliant-database-hosting/elasticsearch-distributed-nosql-database/
  5. https://softwareengineeringdaily.com/2019/02/24/what-is-new-about-newsql/

Jeyasumangala Rasanayagam

Intern- Software Engineer| WSO2, Undergraduate at Sri Lanka Institute of Information Technology