Mutable Data is dead. Long live Immutable Data.


This is the second part of the post on the 13 principles of next-generation Enterprise Software Technology.

  • Applying the scientific method to solve problems.
  • Measurement.
  • Flawless execution.

These three things should be at the core of every single enterprise.

You cannot execute well without measurement and proper analysis. And you cannot really measure anything without having the data in the first place.

The next generation of Enterprise Software Technology needs to be data-oriented from the ground up. Data is the foundation for all businesses.

As an example, let’s consider sales funnel analysis, the most fundamental thing for a business, where you look at new and won opportunities and their whole lifecycle over time. The first questions you are going to ask are: How did the funnel look this month? How much did we win? What were the conversion rates at the different stages of the process? Once you know the state for this month, you will want to know what happened last month, last quarter and during the whole year, and to see the fluctuations over time.

That brings us to the point that it’s not only the data itself that’s critical but also its time component.

The majority of Enterprise Software Technologies on the market today don’t solve that at all, as they overwrite the data, holding only the most current state. It is worth emphasising what this means: data in such technologies is mutable. They may implement a partial solution by snapshotting some portion of the pre-calculated measures (as opposed to all underlying data entities), but this limits customers to static snapshots and the granularity associated with them.

In the end, customers are left with very narrow visibility into their businesses.

So how can that be solved?

Facts.

A fact is an event that happened at a given time and, from a logical standpoint, can’t change in the future.

The idea that stands behind facts is called immutable data: a concept where you never overwrite any data. You always add new data entries, whether they record reads, creates, updates or deletes. You can easily get the current state of the data by getting the newest record, but on top of that it is equally easy to get the data as of any point in time.
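To make that concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory `FactLog`; the class, field names and sample values are mine for illustration and do not come from Base or any real library. Facts are only ever appended, the current state is the newest fact, and the state at an earlier moment is the newest fact recorded at or before that moment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Fact:
    entity_id: str
    recorded_at: datetime
    operation: str  # e.g. "create", "update", "delete", "read"
    attributes: dict[str, Any]


class FactLog:
    """Hypothetical append-only store: it has no update or delete method."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def append(self, fact: Fact) -> None:
        self._facts.append(fact)

    def as_of(self, entity_id: str, moment: datetime) -> Optional[Fact]:
        """Newest fact for the entity recorded at or before `moment`."""
        candidates = [
            f for f in self._facts
            if f.entity_id == entity_id and f.recorded_at <= moment
        ]
        return max(candidates, key=lambda f: f.recorded_at, default=None)

    def current(self, entity_id: str) -> Optional[Fact]:
        """Newest fact for the entity, i.e. its current state."""
        return self.as_of(entity_id, datetime.max)


# Illustrative data only.
log = FactLog()
log.append(Fact("deal-1", datetime(2024, 1, 5), "create", {"stage": "incoming"}))
log.append(Fact("deal-1", datetime(2024, 2, 10), "update", {"stage": "won"}))

latest = log.current("deal-1")
end_of_january = log.as_of("deal-1", datetime(2024, 1, 31))
```

Here `latest` carries the "won" stage, while `end_of_january` still carries "incoming": the older fact was never overwritten, so both answers remain available.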

To visualise that, let’s take a brief look at Base and how a deal entity progresses through stages in the sales funnel. Adding a new deal to the first stage of the pipeline is represented by a new fact that states the creation of the deal. Moving a deal to a different stage is represented by storing a new fact that holds the new attributes of the deal, which in that particular case is a stage ID. Editing a deal itself (which is a typical update operation) results in the creation of a new fact that holds the updated attributes of the deal. And finally, viewing a deal stores a new fact that reflects this interaction too.
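The deal lifecycle described above could be sketched roughly as follows. This is an illustration under my own assumptions, not how Base actually stores its data: the fact kinds (`deal_created`, `stage_changed`, `deal_updated`, `deal_viewed`), the attribute names and the sample values are all hypothetical.

```python
from datetime import datetime

# Hypothetical fact stream for one deal: (time, kind, attributes set by the fact).
deal_facts = [
    (datetime(2024, 3, 1, 9, 0), "deal_created",
     {"name": "Acme renewal", "stage_id": 1, "value": 5000}),
    (datetime(2024, 3, 4, 14, 30), "stage_changed", {"stage_id": 2}),
    (datetime(2024, 3, 6, 11, 15), "deal_updated", {"value": 7500}),
    (datetime(2024, 3, 6, 16, 45), "deal_viewed", {}),
]


def deal_state(facts, moment):
    """Fold all facts recorded at or before `moment` into one attribute dict."""
    state = {}
    for recorded_at, _kind, attributes in sorted(facts, key=lambda f: f[0]):
        if recorded_at > moment:
            break
        state.update(attributes)  # later facts override earlier attribute values
    return state


state_on_march_5 = deal_state(deal_facts, datetime(2024, 3, 5))
state_on_march_7 = deal_state(deal_facts, datetime(2024, 3, 7))
```

On March 5 the deal is already in stage 2 but still worth 5000; by March 7 the value is 7500. The view fact changes no attribute, yet it stays in the stream and can be counted or analysed like any other fact.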

How is that different from snapshotting? Snapshotting is limited by its time resolution. It might be daily, weekly, sometimes hourly. But it’s very hard to take fine-grained snapshots of all data with a resolution of hours, minutes or seconds.

That time resolution defines the end accuracy of the data. If it is daily, you cannot reason about what happened from hour to hour, so you lose critical insight into what was happening during the day.
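A small Python sketch of that loss, with made-up stage names and timestamps: a deal moves through three stages within one day, and a snapshot taken once a day at midnight only ever sees where it ended up.

```python
from datetime import datetime

# Hypothetical stage changes for one deal, listed in time order.
stage_facts = [
    (datetime(2024, 3, 6, 9, 0), "qualified"),
    (datetime(2024, 3, 6, 11, 30), "quote"),
    (datetime(2024, 3, 6, 15, 0), "closure"),
    (datetime(2024, 3, 7, 10, 0), "won"),
]


def stage_at(facts, moment):
    """Stage named by the newest fact at or before `moment`, or None."""
    known = [stage for recorded_at, stage in facts if recorded_at <= moment]
    return known[-1] if known else None  # relies on facts being in time order


# A daily snapshot taken at midnight records one state per day.
daily_snapshots = [stage_at(stage_facts, datetime(2024, 3, day)) for day in (6, 7, 8)]

# The fact stream retains every transition.
all_stages = [stage for _, stage in stage_facts]
```

With this data the snapshots only ever record "closure" and "won" (plus an empty first day), so the "qualified" and "quote" stages never show up in them, while the fact stream still holds all four transitions.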

Immutable data gives you “maximum resolution”, so that you see every single change, meaning you don’t lose any data.

Immutable data all the way.

It is very important to architect the software in such a way that all the data within systems is immutable. That enables in-depth data discovery. Tracking every single change to every single entity in the system means that you can now, for instance, compare two points in time and ask what really changed between them.
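As a rough illustration of that kind of comparison, the Python sketch below rebuilds the state of every entity at two moments and reports the attributes that differ. The fact layout, entity IDs and values are assumptions made for the example, and the sketch ignores deleted attributes and entities for brevity.

```python
from datetime import datetime

# Hypothetical facts: (time, entity id, attributes set by that fact).
facts = [
    (datetime(2024, 1, 10), "deal-1", {"stage": "incoming", "value": 5000}),
    (datetime(2024, 1, 20), "deal-2", {"stage": "incoming", "value": 1200}),
    (datetime(2024, 2, 3), "deal-1", {"stage": "won"}),
    (datetime(2024, 2, 15), "deal-2", {"value": 1500}),
]


def state_at(facts, moment):
    """Rebuild every entity's attributes from facts recorded at or before `moment`."""
    state = {}
    for recorded_at, entity_id, attributes in sorted(facts, key=lambda f: f[0]):
        if recorded_at <= moment:
            state.setdefault(entity_id, {}).update(attributes)
    return state


def changes_between(facts, start, end):
    """Map (entity id, attribute) to its (old, new) values between two moments."""
    before, after = state_at(facts, start), state_at(facts, end)
    changes = {}
    for entity_id, attributes in after.items():
        old = before.get(entity_id, {})
        for name, value in attributes.items():
            if old.get(name) != value:
                changes[(entity_id, name)] = (old.get(name), value)
    return changes


diff = changes_between(facts, datetime(2024, 1, 31), datetime(2024, 2, 29))
```

For this data, `diff` reports that deal-1 moved from "incoming" to "won" and that the value of deal-2 rose from 1200 to 1500 during February. Neither question could be answered if the January values had been overwritten.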

The customers.

In all that, let’s not forget about the customers. There are multiple ways to capture the data, but only one truly works and delivers real value to the customers. Data collection needs to be seamless to the customer, so that their productivity does not suffer and their flow with the system is not interrupted in any way.

Stay tuned for the next posting, in which I’m going to cover more operational and technical aspects of immutable data.
