NoSQL: Don’t Let The Name Fool You

By Brian Rush, Senior Software Engineer, Capax Global

brianmrush
Hitachi Solutions Braintrust
6 min read · Oct 24, 2018

NoSQL is queryable data; it just solves problems differently and in some cases better…

As software engineers, we are asked to build systems that solve a whole host of problems. With experience, engineers encounter similar problems on different projects, or even at different companies. We sometimes approach the problem with a tried, true, and tested model. We think: why reinvent the wheel? In this article, I will try to expand your thinking beyond traditional relational database management systems (RDBMSs), and specifically about how NoSQL databases can be used to solve problems that have traditionally been tackled with a relational database.

Let’s face it: for decades, business software has been built upon relational databases. RDBMSs are incredibly mature, powerful, and enabling. They permit the creation of data-driven systems that solve an enormous range of business problems. Put simply, RDBMSs are great! It’s no surprise, then, that software engineers are quick to reach for one.

Let’s imagine a typical scenario we face as developers. A project kicks off within a company and a new system is to be built. There is a design and discovery phase where the project takes shape. Early in that phase, engineers have already started to identify the tools they will use to solve the problem.

It’s at this point in the design that, as engineers, we need to be careful we aren’t doing ourselves a disservice by biasing our architecture choices toward only the things that are familiar to us.

In the next few sections, I want to describe a problem that could be solved using an RDBMS but may be better served by a NoSQL database; in particular, Cosmos DB. I will try to break the mental model of using an RDBMS as the default data store. After all, the famous musician Frank Zappa once said, “Without deviation from the norm, progress is not possible.”

Business Problem

For a moment, let’s imagine we are building out an e-commerce solution for a B2B company that sells window coverings. A distinguishing characteristic is that the company does not manufacture the window coverings but rather sells products manufactured by a few dozen of its partners.

During the discovery and design phase, we identified some essential qualities and characteristics of the product catalog:

  • Each product has its own unique set of attributes, and lots of them. All products also share a small core set of common attributes.
  • The set of attributes can change frequently. Examples include pricing, availability, product options, descriptions, etc.
  • Each manufacturer has its own way of representing a product and its relevant attributes.
  • Querying and displaying the product and all of its relevant attributes is what helps customers understand what they are purchasing.

We could go through an exhaustive analysis phase and design a relational model in which all the products and attributes fit into a well-defined relational schema. It’s doable but difficult. If we want to keep a manufacturer’s changes from having a severe impact on the product schema, options in the relational world are scant. The usual relational tool would be an entity-attribute-value (EAV) model, which is heavy and has serious performance disadvantages, since reassembling a product means pivoting many rows and filtering on attributes requires self-joins. This is the point in the project where considering a NoSQL approach shows very significant value.
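To make the EAV trade-off concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table name, column names, product ids, and the “Acme” brand are all hypothetical, chosen only for illustration; this is not the schema of any real catalog.

```python
import sqlite3

# Hypothetical EAV table: one row per (product, attribute) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_attribute ("
    " product_id TEXT, attribute TEXT, value TEXT,"
    " PRIMARY KEY (product_id, attribute))"
)
rows = [
    ("p-100", "name", "Cellular Shade"),
    ("p-100", "brand", "Bali"),
    ("p-100", "price", "89.99"),
    ("p-200", "name", "Faux Wood Blind"),
    ("p-200", "brand", "Acme"),
]
conn.executemany("INSERT INTO product_attribute VALUES (?, ?, ?)", rows)

# A new attribute needs no schema change, which is the appeal of EAV.
conn.execute(
    "INSERT INTO product_attribute VALUES ('p-200', 'lift_type', 'cordless')"
)

# The cost: filtering on one attribute and returning another already takes
# a self-join, and every value is stored as untyped text.
query = """
    SELECT n.value
    FROM product_attribute AS b
    JOIN product_attribute AS n
      ON n.product_id = b.product_id AND n.attribute = 'name'
    WHERE b.attribute = 'brand' AND b.value = ?
"""
bali_names = [name for (name,) in conn.execute(query, ("Bali",))]


def load_product(product_id):
    """Reassemble one product by pivoting its attribute rows into a dict."""
    cur = conn.execute(
        "SELECT attribute, value FROM product_attribute WHERE product_id = ?",
        (product_id,),
    )
    return dict(cur.fetchall())
```

Each additional attribute used in a filter adds another self-join, which is one reason the model gets heavy as the catalog grows.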

NoSQL Advantages

  • We can create documents that represent each of our products in a schema-less way. This means that as our products and attributes change, so can our documents, free of fuss and complexity.
  • All of our products can still be easily queried. In particular, Cosmos DB supports querying documents with a SQL-like query language.
  • NoSQL databases such as Cosmos DB are designed to scale out to extremely large volumes of data, often with little more than a configuration change.
  • NoSQL databases are typically fast for reads and writes, and Cosmos DB even backs its performance with SLAs.
  • One could argue that the object-relational impedance mismatch is eased by leveraging NoSQL capabilities, because documents can mirror the shape of our objects.

Specific Examples

Let’s drill a bit more into the details of our product catalog to show how storing and querying it in a NoSQL database like Cosmos DB has advantages. Take this sample JSON representation of a product:
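A minimal illustrative document might look like the one below. Every field name and value is an assumption made for the sake of the example (only the brand 'Bali' and the size and path of images are mentioned in this article). It is wrapped in a few lines of Python so it can be parsed and inspected.

```python
import json

# Hypothetical product document; every field name and value is illustrative.
PRODUCT_JSON = """
{
  "id": "p-100",
  "name": "Cellular Shade",
  "brand": "Bali",
  "category": "Shades",
  "tags": ["cordless", "light-filtering"],
  "colors": ["white", "linen"],
  "images": [
    {"size": "small", "path": "/images/p-100/small.jpg"},
    {"size": "large", "path": "/images/p-100/large.jpg"}
  ]
}
"""

# Parse the JSON text into ordinary Python dicts and lists.
product = json.loads(PRODUCT_JSON)
print(product["brand"], len(product["images"]))
```

Note that tags, colors, and images live inside the product itself instead of in separate tables.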

In the RDBMS world, we would likely have tables for the product, brand, category, images, tags, and colors so we can store the relevant data about our product. Let’s suppose that in the future the manufacturer adds a new attribute for options. In that case, we would have to change our database schema to add an options table, and then update the code that ingests the product information so it saves the new data. Similarly, we would likely have to update our data access logic to be aware of the options table when querying our database.
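The three steps can be sketched with Python’s built-in sqlite3 module. The schema here is a heavily simplified, hypothetical one (only product and color tables are shown), and the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical relational schema: only product and colors shown.
conn.executescript("""
    CREATE TABLE product (id TEXT PRIMARY KEY, name TEXT, brand TEXT);
    CREATE TABLE product_color (product_id TEXT REFERENCES product(id), color TEXT);
""")
conn.execute("INSERT INTO product VALUES ('p-100', 'Cellular Shade', 'Bali')")

# Before the migration there is nowhere to put the new "options" data.
try:
    conn.execute("INSERT INTO product_option VALUES ('p-100', 'lift', 'cordless')")
    stored_before_migration = True
except sqlite3.OperationalError:  # raised because the table does not exist yet
    stored_before_migration = False

# Step 1: change the schema.
conn.execute(
    "CREATE TABLE product_option ("
    " product_id TEXT REFERENCES product(id), name TEXT, value TEXT)"
)
# Step 2: the ingestion code must now write to the new table.
conn.execute("INSERT INTO product_option VALUES ('p-100', 'lift', 'cordless')")
# Step 3: read queries must join the new table to surface the options.
options = conn.execute(
    "SELECT o.name, o.value FROM product p"
    " JOIN product_option o ON o.product_id = p.id WHERE p.id = ?",
    ("p-100",),
).fetchall()
```

In a production system each of these steps would usually also involve a migration script and a coordinated deployment.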

The first important advantage of NoSQL is that changes like this pose no problem. Within the NoSQL world, our product catalog can be schema-less. This gives it a distinct advantage over an RDBMS: when the new options attribute appears, we don’t need to change our data storage in order to store it. Likewise, the new data is immediately available to query. With Azure Cosmos DB, we can query this product through the SQL API, using syntax very close to the SQL we already know.
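As a rough illustration of the schema-less idea, here is a toy in-memory document store in Python. It is a hypothetical stand-in, not the Cosmos DB SDK, and the document fields are assumed for the example.

```python
# In-memory stand-in for a document container: a dict keyed by document id.
# This is a hypothetical toy store, not the Cosmos DB SDK.
container = {}


def upsert(doc):
    """Store the document as-is; no schema is consulted or updated."""
    container[doc["id"]] = doc


upsert({"id": "p-100", "name": "Cellular Shade", "brand": "Bali"})

# Later the manufacturer starts sending "options"; the same call stores it.
upsert({
    "id": "p-200",
    "name": "Roller Shade",
    "brand": "Bali",
    "options": [{"name": "lift", "values": ["cordless", "motorized"]}],
})

# The new attribute is immediately usable in a filter. Older documents
# simply lack it, so lookups must tolerate its absence.
with_options = [d["id"] for d in container.values() if d.get("options")]
```

The flexibility moves responsibility into the application: code that reads documents has to handle attributes that may or may not be present.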

An example of this using Cosmos DB as our storage is detailed below. This example selects all products where the brand is ‘Bali’. This is very similar to how we would select in an RDBMS, except that we don’t need any joins. We are querying the relevant data directly in the schema-less documents.
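A sketch of what such a query could look like, assuming a container aliased as p and a top-level brand property. The query text is shown for reference only; the Python below does not connect to Cosmos DB but applies the same filter to a few hypothetical in-memory documents.

```python
# Query text in the style of the Cosmos DB SQL API. The alias "p" and the
# property name "brand" are assumptions about the document shape.
QUERY = "SELECT * FROM products p WHERE p.brand = 'Bali'"

# Hypothetical documents standing in for the container's contents.
products = [
    {"id": "p-100", "name": "Cellular Shade", "brand": "Bali"},
    {"id": "p-200", "name": "Faux Wood Blind", "brand": "Acme"},
    {"id": "p-300", "name": "Roller Shade", "brand": "Bali", "tags": ["cordless"]},
]


def select_by_brand(docs, brand):
    """Local equivalent of the WHERE clause: keep documents whose brand matches."""
    return [d for d in docs if d.get("brand") == brand]


bali = select_by_brand(products, "Bali")
print(QUERY)
print([d["id"] for d in bali])
```

The whole document comes back in one piece, including any attributes that only some products have.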

Similarly, we can write our SQL to use nested attributes in the document structure and reshape the result to produce new structures. In other words, we can shape the returned data to fit our object model. In the following example, we are joining on a subdocument and selecting just the size and path of images where the size is small.
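A sketch of such a query, assuming each product carries an images array whose items have size and path properties. In the Cosmos DB SQL API, JOIN ... IN iterates over an array inside each document instead of joining two tables. The Python below emulates that behavior locally on hypothetical documents and does not call Cosmos DB.

```python
# Query text in the style of the Cosmos DB SQL API; property names are assumed.
QUERY = """
SELECT i.size, i.path
FROM products p
JOIN i IN p.images
WHERE i.size = 'small'
"""

# Hypothetical documents standing in for the container's contents.
products = [
    {"id": "p-100", "brand": "Bali", "images": [
        {"size": "small", "path": "/img/p-100-s.jpg", "alt": "front view"},
        {"size": "large", "path": "/img/p-100-l.jpg", "alt": "front view"},
    ]},
    {"id": "p-200", "brand": "Acme", "images": [
        {"size": "small", "path": "/img/p-200-s.jpg"},
    ]},
    {"id": "p-300", "brand": "Acme"},  # no images, so it contributes no rows
]


def small_images(docs):
    """Pair each document with its images, filter, and project two fields."""
    return [
        {"size": image["size"], "path": image["path"]}
        for doc in docs
        for image in doc.get("images", [])
        if image["size"] == "small"
    ]


result = small_images(products)
```

The result is a flat list of small objects containing only size and path, a different shape from the stored documents.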

The main point of this is that our schema-less documents are still very much queryable using the familiar SQL syntax.

This is just a small example of how a NoSQL database like Cosmos DB may be better suited to problems that have traditionally been solved with RDBMSs. The point is that if we open our mental model of solutions to new tools like NoSQL, we may find we can build better systems.

NoSQL databases, and Cosmos DB in particular, are well suited to many common use cases. These include:

  • IoT and Telematics Systems
  • Retail and Marketing Systems
  • Gaming
  • Social Applications
  • Recommendation Engines

Summary

The technology space is constantly evolving. Keeping our eyes and ears open to new technologies can result in better solutions for our applications. NoSQL databases and Cosmos DB have key advantages that you should consider. If you would like to find out more about Cosmos DB, I invite you to read this Cosmos DB primer article.

Published October 24, 2018
