To SQL or NoSQL, that was the Question

Published in

The Cloud Builders Guild

7 min readFeb 26, 2020

I’ve recently been evaluating NoSQL databases as a possible candidate for our OLTP data storage requirements. Given the good things which I have heard about NoSQL in general, I wanted to ascertain whether it would be a good fit for the space that I work in. If I could summarise that space with a few dot points, it would be:
- enterprise line of business (LOB) applications
- relatively small user-base
- real-time reporting requirements

I’ve spent all of my career building/maintaining bespoke applications for enterprise. These applications can be very big. But their user-base never is; it is not the whole of the general public. It may be 5 users, it may be 50,000. But it is never millions. These applications always need to report data back to the user. So, if the user adds some data and immediately runs a report, that data must be included in the report.

Keeping those high level requirements in mind, I’ll walk through my journey with NoSQL and explain how I came to the conclusion that NoSQL was not really a good fit for the space which I work in (despite my appreciation for it as a great technology).

The ME*N (MongoDB, Express.js, *, Node.js) stack seemed to be the best place to start and as I only intended to build an API for my evaluation, there was no need for a front-end framework (hence the asterisk). So, I built a small API using those technologies.

Domain Modelling

One of the things which became apparent to me during this process was the need to think about your domain differently; the need to model it differently to adapt to the NoSQL approach to storage. Data which is accessed together is stored together. (But this must be given very careful thought, as even smart people can get it wrong). This is very necessary to reap the benefits of this kind of database and I concede that a shift in thinking at the domain creation stage is not going to come quickly (especially over a couple of weekends and possibly not even on the first project). But this is not the kind of thing which should deter the adoption of a technology. Hard work, lots of research and asking questions on forums can help clear this hurdle.

I can see how this kind of domain modelling would be brilliant for most mobile applications. Such applications mostly contain data which are centric to the user. So, with many of these apps, you can store all the data relevant to that user in one big object, which can be retrieved very quickly when the user opens the app (says the guy who’s never made a mobile app).

And this is one aspect where requirements differ somewhat from the apps that I work on. Quite often, the bulk of the data has little to do with the user themselves. There’s mountains of generalized data which the user needs to consume/mutate/report on for the purposes of their job. Not the kind of scenario where you can just drop a bunch of many and varied object graphs on disk and easily retrieve them. Especially with the filtering requirements on a variety of views, which often carve up the data in all kinds of wonderful ways. So this was the first item that I noted which was a strike against NoSQL for my work.

Schema

When working with Mongo, there was something that I found which was somewhat curious. Many devs use Mongoose to interact with MongoDb. This involves adding schema, so that your models match a schema for the relevant collections. On one hand, I can see the need for this, as your application will have a domain, despite the fact that your database enforces no schema. On the other hand, I wonder, “why use a schemaless database?”, only to enforce one in your app? This was not a huge issue, as Mongoose does permit you to tell it to ignore schema and it provides a lot of other goodness in terms of interacting with MongoDb. This article by Rick Negrin makes some pretty good points about schema and essentially deems the notion of having a schemaless domain as a fallacy in so far as it kicks the can down the road from those who put data in, to those who pull it out.

As a technology, the schemaless capability is very cool and you can see a great example of it in this talk (I’ve queued the time in the link). But the lingering question for me is, “does it solve a problem that I don’t have”. And the answer to that was, “yes”. If I want related data, I perform the dreaded join. And here’s the dirty little secret, they are actually pretty fast. My users are not kept waiting. In fact, the database access and deserialisation of that data into objects in memory is rarely the slowest link in the chain. There are odd occasions (for example, a hefty report) where data retrieval can be expensive. But for the day-to-day running of my applications, I do not need to be eschewing joins. It just isn’t a problem. This is the second item which was a strike against the adoption of NoSQL for my work.

Access Patterns

This brings me to my next observation, the inflexibility of Access Patterns. I found that when working with this kind of architecture, your access patterns for requests do become somewhat rigid vis-a-vis the objects which you are storing on disk. Alex DeBrie talks about this in this article about DynamoDb. If I were to distill this idea, the handling of a particular request becomes coupled with the way the data is stored in the database. You tune your table-design to the access pattern, making the access pattern harder to evolve over time. This was a big blocker for me. There is so much that I do in my work to decouple the moving parts of my applications, with a view to making them more maintainable. This idea of tightly coupling the database with requests seemed anathema to such efforts. So, NoSQL is not matching my requirements with this issue.

Reporting

As mentioned above, reporting requirements are a common requirement of the applications that I work on. And this includes ad hoc reports. So, if a user mutates some data and runs a report 30s later, they have an expectation that the change which they made will be reflected in the resulting report. As I understand it, Mongodb had a few issues with writes not being guaranteed in the past. You writes became a fire and forget situation. But it has apparently addressed that, so this should meet my requirement for the immediacy of reports. However, the actual querying may be an issue. In his article referenced above, Alex DeBrie notes that analytics can be problematic with the kind of table/collection design required by a NoSQL database. This means that you need to come up with some kind of solution for routinely pulling the data out of the NoSQL database and storing it again in a different data store which is more analytics-friendly. Such an approach would fall foul of my requirement for immediate data. Your analytics data would always be behind, by the time duration between such sync operations. Or, you could just use a SQL database and query the data for your reports. I believe SQL databases win hands down for reporting, in the space which I work in.

Perf

Another feature of NoSQL databases which definitely addresses the issues of many applications is performance. Blazing fast performance. Particularly writes. I could not really notice anything too magical here, as my minor meanderings with MongoDb did not really put it to the test. But in any case, this is not a requirement for the applications I work on. SQL databases are plenty fast enough. If I was getting millions of hits every couple of seconds, things would be different. This leads me to my final point.

Scalability

The final consideration which tanked the business case for a switch to NoSQL was another problem which did not need solving. Scalability. My apps never need to scale beyond the capability of a relational database. Recall that I said my user base is always pretty small. I do not get millions of requests per second. Heck, I do not get millions of requests per minute. On the current app I am maintaining, for one particular customer, I can see hours go by without a single http request. They use the app when they need to. As such, whilst acknowledging this as a great strength of the NoSQL databases, scalability is not a problem for me (yet).

Conclusion

It’s almost dizzying the number of tools which we have at our disposal to address the business needs of our clients. Each tool is best suited to a certain set of requirements. The various NoSQL databases are great and have been very helpful solving problems for some organisations. The one sentence which really leapt out at me from Alex DeBrie’s article was the observation that NoSQL is probably not the answer, “whenever I need query flexibility and/or easier analytics more than I need blazing fast performance.” And that’s my situation right there. Hence our decision to continue on with relational databases in meeting the business requirements of our customers.