What Datomic brings to businesses

A few days ago in the Clojurians Slack, someone asked how to express the value proposition of the Datomic database system to a non-technical stakeholder, in business terms. I was in a good position to lay out a basic answer, because I had done that exercise 18 months ago, when I convinced my partner on BandSquare that we needed a technology shift.

This post is an attempt to express the value of Datomic to a business in a more detailed and articulated form. I’ll try to express the advantages and drawbacks of Datomic as much as I can in non-technical terms, by describing the tangible consequences of using Datomic. I won’t try to support my arguments with technical details, but I’ll provide some references for technical experts to dig deeper.

As we’ll see, the business value does not lie in the ‘cool features’ — it’s about the problems you don’t have.

Expressive querying

Datomic provides high query power, which means it’s straightforward to translate a question about data to code.

Consequences: developers ship new features faster, and achieve more code reuse.

Symptoms:

  • “I know it sounds simple, but it will take a lot of time to write the query for it, as the data is not in a shape to accommodate it.”
  • “I can write this query, but it will likely be slow.”

Example: back when BandSquare used MongoDB, it was tremendously hard to write a query for “Give me the number of customers who have booked tickets for concerts of that organisation”: it required dozens of lines of code, with poor readability and terrible performance. In Datomic Datalog, the same question is a short, readable query.

Technical reasons*:

  • Datomic supports multiple paradigms for querying: logical/relational-like (via Datalog, which is as expressive as SQL), navigational/graph-like (via the Entity API), GraphQL-like (via the Pull API).
  • Datomic’s architecture enables it to provide a lot of caching, to the point that I/O round trips are rarely needed to answer a query. As a consequence, querying from a Peer is effectively non-remote, which gives developers far fewer constraints for writing their queries (from the point of view of the application, the data is “over here” instead of being “over there”). This allows Datomic developers to split their querying logic into small, simple, composable queries. In contrast, client-server databases force you to think about each query like an expedition, and you end up with bloated, single-purpose queries (or pay a price in performance and/or consistency).
  • Datomic is programmable: read and write requests are built up from data structures, not text; the data schema is stored in the same form as the data; the API is very generic, and backed by a very simple information model (see the Universal Schema, in Appendix C). This, combined with the above-mentioned performance characteristics, makes it very straightforward to build higher-level database systems on top of Datomic; for instance, it’s very easy to implement a GraphQL or CQRS architecture on top of Datomic.
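To give a feel for the “small, composable queries” idea, here is a minimal sketch in plain Python. It is a toy model, not Datomic’s API and not the query we actually wrote at BandSquare: facts are simple (entity, attribute, value) tuples held in memory, and every id, attribute name and helper function is made up for illustration.

```python
# Toy model only: facts are (entity, attribute, value) tuples in local memory.
# All ids and attribute names are hypothetical.
facts = {
    ("c1", "concert/organisation", "org-1"),
    ("c2", "concert/organisation", "org-1"),
    ("c3", "concert/organisation", "org-2"),
    ("b1", "booking/concert", "c1"),
    ("b1", "booking/customer", "alice"),
    ("b2", "booking/concert", "c2"),
    ("b2", "booking/customer", "alice"),
    ("b3", "booking/concert", "c2"),
    ("b3", "booking/customer", "bob"),
    ("b4", "booking/concert", "c3"),
    ("b4", "booking/customer", "carol"),
}

def entities_with(attribute, value):
    """Entities e for which the fact (e, attribute, value) exists."""
    return {e for (e, a, v) in facts if a == attribute and v == value}

def values_of(entity, attribute):
    """Values v for which the fact (entity, attribute, v) exists."""
    return {v for (e, a, v) in facts if e == entity and a == attribute}

def customers_of_organisation(org):
    """Compose the two small lookups: organisation -> concerts -> bookings -> customers."""
    customers = set()
    for concert in entities_with("concert/organisation", org):
        for booking in entities_with("booking/concert", concert):
            customers |= values_of(booking, "booking/customer")
    return customers

customer_count = len(customers_of_organisation("org-1"))
```

Because the data is local, chaining several tiny lookups costs nothing extra, which is the property that lets query logic stay small and reusable.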

No data loss

With Datomic, it’s practically impossible to lose data by accident. You can always go back to every version of your data, and know when and how it evolved.

Consequences: it’s easy to debug and recover from human or programming errors.

  • If some of your data becomes lost or corrupted, you can always go back to the version where it was present and clean (without restoring a dozen backup files). You can know when (and usually how) the data went bad.

Technical reasons*: the database is a growing set of datoms, which only ever get added, not deleted. Database values provide an asOf() operation which yields a view of the database at any point in the past. Each write to the database is annotated with the time at which it happened, and optionally with additional metadata (which user originated the transaction, etc.; see reified transactions).
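The accumulate-only idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names (Datom, transact, as_of, tx/user) are mine, and replaying a list is a stand-in for what a real database does with indexes.

```python
import itertools
from collections import namedtuple

# One fact: entity, attribute, value, transaction id, and whether the fact
# was asserted (True) or retracted (False).
Datom = namedtuple("Datom", "entity attribute value tx added")

log = []                     # append-only: datoms are never edited or removed
tx_ids = itertools.count(1)  # hypothetical transaction counter

def transact(changes, user):
    """Append one transaction; `changes` holds (added, entity, attribute, value)."""
    tx = next(tx_ids)
    # Metadata about the transaction itself is stored as just another datom.
    log.append(Datom(tx, "tx/user", user, tx, True))
    for added, entity, attribute, value in changes:
        log.append(Datom(entity, attribute, value, tx, added))
    return tx

def as_of(t):
    """Facts that held right after transaction `t`, rebuilt by replaying the log."""
    current = set()
    for d in log:
        if d.tx > t:
            break
        fact = (d.entity, d.attribute, d.value)
        if d.added:
            current.add(fact)
        else:
            current.discard(fact)
    return current

email = ("acct-1", "contact/email", "ann@example.com")
t1 = transact([(True, *email)], user="signup-form")
t2 = transact([(False, *email)], user="buggy-script")  # accidental deletion

before, now = as_of(t1), as_of(t2)
culprit = next(d.value for d in log if d.tx == t2 and d.attribute == "tx/user")
```

The “deletion” is itself a new entry in the log, so the earlier state stays reachable through as_of(t1), and the transaction metadata tells you who retracted the email.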

Straightforward, flexible data modeling

Data modeling with Datomic is straightforward, which means you don’t need to spend time wondering in what shape you should store your data.

Consequences: your system is easy to evolve. You need little anticipation of future needs to store data.

Technical reasons*:

  • The Universal Schema (see Appendix C) relieves the developer of many early-on architectural decisions they would have to make when using table or document-oriented storage (‘in what shape should I store this to query efficiently later?’, ‘should this column go in this table or should I make another table’, etc.).
  • A given attribute can be shared by many ‘entity types’, e.g. you could have a :contact/email attribute shared by both your website users and your back-office administrators. This means that attributes can be very generic, and more importantly you don’t have to figure out how generic they are upfront.
  • Sparsity is much more acceptable than in an SQL database, which means you typically won’t need to make breaking changes to your schema. When an attribute is no longer needed, you don’t delete it, you just stop using it (instead of deleting a column in a table); nor will you change existing attributes, you’ll just create new ones instead and forget about the old ones. As a consequence, data migrations are fairly rare compared to SQL databases.
  • Unlike databases which have separate languages for data modeling and querying, the schema of Datomic consists of plain Datomic datoms, which you can read and write in the same way you would ordinary data.

Testing is cheap

Automated testing is an important, well-established best practice in software development. If you’re not testing your software, it’s most likely costing you dearly in fixing bugs, manual QA, and difficulties developing new features.

Database code has been traditionally difficult to test, because databases don’t lend themselves well to simulation. However, Datomic pretty uniquely supports speculative writes, which among other things makes testing Datomic-related code easy and efficient.

Consequences: testing your code is cheap, to the point it’s always worth it, even in the short term. This results in a significant increase in quality and productivity, at a very low cost.

Technical reasons*:

  • Datomic has an in-memory implementation, which means your tests need little environmental setup.
  • Datomic database values provide a with() operation, which gives a local view of the database as if some writes were applied, but without making those writes visible globally. This can be used to implement an efficient fork operation on database values, which is exactly what you need for testing: you can run entire test scenarios speculatively, without needing teardown phases (the GC does the teardown for you). Another way to look at it is that Datomic connections can be mocked trivially.
  • See Application Architecture with Datomic, Branching Reality and Full stack Teleport testing with Om Next and Datomic
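Here is a rough Python sketch of what speculative writes buy you in tests. It is an analogy, not the Datomic API: the Db class, its with_ method and the registration_facts function are all hypothetical, and a frozenset stands in for an immutable database value.

```python
class Db:
    """An immutable database value: a frozen set of (entity, attribute, value) facts."""

    def __init__(self, facts=()):
        self.facts = frozenset(facts)

    def with_(self, assertions):
        """Return a NEW value with the extra facts; the original is left untouched."""
        return Db(self.facts | frozenset(assertions))

def registration_facts(db, user_id, email):
    """Business logic under test: compute the facts to write, or refuse duplicates."""
    if any(a == "user/email" and v == email for (_, a, v) in db.facts):
        raise ValueError("email already registered")
    return [(user_id, "user/email", email)]

# Test scenario: fork the base value, write speculatively, assert on the fork.
base = Db([("u1", "user/email", "ann@example.com")])
fork = base.with_(registration_facts(base, "u2", "bob@example.com"))

try:
    registration_facts(fork, "u3", "bob@example.com")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# No teardown: `base` never changed, and `fork` is simply garbage-collected.
```

The scenario chains a write and a follow-up check without touching any shared state, which is why no cleanup code is needed between tests.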

Reproducibility

A key factor for solving bugs rapidly is your ability to reproduce the circumstances of a bug in your local environment. Datomic has 2 features that make it very easy:

  • you can obtain any past version of the database instantly (without needing a backup!)
  • you can ‘clone’ your production Datomic instance instantly on your local machine, which essentially gives you a local reproduction of your server.

Consequences: you can instantly reproduce your production environment locally, which makes for faster debugging. In addition, if you need to apply manual corrections to your data, you can safely dry-run patches before applying them in production.

Technical reasons*:

  • the ability to fork a database (see above)
  • the ability to obtain a past version of the database (aka db.asOf(t))

Integrating other data systems

As a system grows, it usually requires adding new types of databases to satisfy diverse querying needs. For instance:

  • Maybe you need some business insights and want to add a data warehouse like BigQuery.
  • Maybe you need some search capabilities and want to add an ElasticSearch instance, etc.
  • Maybe your website is slow, and you need to make some computations ahead-of-time and store them in a cache such as Memcached or Redis.
  • Maybe you need to notify some of your partners of events, and want to add a real-time webhook system for that.
  • The client application (e.g. a browser or mobile app) may need to maintain a local cache of data instead of repeatedly re-fetching it from your web server.

You usually have a ‘source of truth’ database, and several other ‘derived’ databases, at which point data synchronization between all those databases becomes an important issue. The key to data synchronization is the ability to answer ‘What changed?’ questions.

This is hard to do with most traditional databases (which are designed only to answer ‘What’s there?’ questions), but it’s trivial to do with Datomic thanks to its ‘log’ structure: if you want to know what changed since last Monday, you just read the log since last Monday!

Consequences: if you start a project with Datomic, it will be easy to send your data to other, complementary data systems in the future.

Technical reasons*:

  • The log-like structure of Datomic. The Log API literally gives you the exact changes, at the finest granularity, between 2 points in time.
  • In addition, the Transaction Report Queue gives you a real-time notification mechanism of all the writes in the system, which you can then feed into a streaming framework, message brokers, webhooks, etc.
  • See How Cognician uses Onyx for an example.
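To make the ‘What changed?’ idea concrete, here is a toy Python sketch of incrementally syncing a derived store from a log. It does not use the Log API; the log layout, the sync function and the dictionary standing in for a search index are all assumptions made for illustration, and each attribute is assumed to hold a single value.

```python
# Hypothetical log entries: (transaction id, added?, entity, attribute, value),
# in the order they were written.
log = [
    (1, True, "u1", "user/name", "Ann"),
    (2, True, "u2", "user/name", "Bob"),
    (3, False, "u1", "user/name", "Ann"),   # tx 3 renames u1: retract the old value...
    (3, True, "u1", "user/name", "Anna"),   # ...and assert the new one
]

def sync(index, log, last_synced_tx):
    """Apply only the changes made after `last_synced_tx`; return the new cursor."""
    cursor = last_synced_tx
    for tx, added, entity, attribute, value in log:
        if tx <= last_synced_tx:
            continue  # already reflected in the derived store
        document = index.setdefault(entity, {})
        if added:
            document[attribute] = value
        elif document.get(attribute) == value:
            del document[attribute]
        cursor = max(cursor, tx)
    return cursor

index = {}                            # stand-in for a search index or cache
cursor = sync(index, log[:2], 0)      # first run: only two transactions exist yet
cursor = sync(index, log, cursor)     # later run: picks up transaction 3 only
```

The derived store only needs to remember one number, the last transaction it has seen, to catch up at any later time.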

Great performance characteristics for the most common use cases

It’s hard to describe the performance aspects of a database system in business terms, but they do become a business concern when the engineers spend their time coping with the database load instead of building new features, or when they have to compromise on the user experience of the product for database performance reasons.

Symptoms:

  • the whole website slows down or crashes when there’s a traffic spike
  • “please, don’t run any export during high-traffic hours”
  • “yeah, the info on that page can be outdated sometimes, please wait a few minutes ’til the cache refreshes”
  • engineers spend time dealing with caching code (which is notoriously difficult and error-prone)

For the most common use cases, Datomic exhibits very interesting performance characteristics. By “the most common use cases”, I mean the ones for which 95% of systems are built (e-commerce, enterprise systems, content management, user management, etc.), most of which have similar database needs:

  • the users read (e.g. browse content) a lot more than they write (e.g. buy something, subscribe to a website, change their preferences in an app, etc.)
  • the reads must be fast and always available (the users cannot wait to receive their content), whereas the writes can take a little longer (it’s okay if signing up to a website takes a few seconds)
  • the data is medium-sized (Gigabytes, not Petabytes)

It turns out the architecture of Datomic makes it a great fit for those needs:

  • the reads can take a virtually infinite load (it’s only a matter of spinning up more machines)
  • even when the system is overwhelmed by too many writes, reading remains available. For instance, on an e-commerce website, users may become temporarily unable to place orders, but they will still be able to browse the catalog, instead of seeing a “Sorry, our website is temporarily unavailable” page.
  • Datomic provides transparent caching, which makes the reads very fast — and unburdens your engineers from having to write complex caching code themselves. It’s like having a Content Delivery Network (CDN), but for data instead of web content.

Consequences: engineers don’t have to think much about the performance or operational aspects when building the system, and can focus on business features instead.

Technical reasons*:

  • Traditional databases store data in mutable cells, which forces them to use locking as a technique for coordinating reads and writes. In contrast, Datomic stores data in a single immutable persistent data structure, and this property of being immutable has interesting implications on performance and scalability:
  • Immutability allows for lots of caching, and loading data lazily where and when it’s needed. In particular, the Datomic Peer Library lets you cache data transparently in the application process or in memcached clusters, without having to write a single line of code: as your application queries a given bit of data, it will fetch that bit of data (and a neighborhood of it) from storage, and store it in the cache; in the end there will be far fewer I/O round trips than actual queries, as most of them will just hit the cache.
  • Reads and writes are completely independent (they don’t happen on the same machine), and reads are independent from each other. In particular, writes don’t slow down reads. No locks!
  • Writes are run one after the other in a single process, the Transactor. You’d think that’s a big scalability issue, but it turns out you can go a long way with a single writer thread once you’ve separated reads from writes; this strategy is notably used by modern SQL engines like VoltDB. You can get High Availability by using a fallback Transactor process.
  • As a consequence, Datomic is fully ACID, with a serializable isolation level.
  • Application processes send writes to the Transactor, but read from storage (storage is the service a Datomic deployment uses for persistence; it’s usually an instance of AWS DynamoDB or Cassandra, but it can also be a plain old SQL database). A Transactor may crash under write load, but then storage will still be fully available.
  • People often worry that immutability will cause storage size to explode. It won’t. For the vast majority of systems, even when they use a mutable database, storage size is proportional to activity, because most data is created, not updated, and almost never deleted. This will still be the case with Datomic, although the factor may be a bit higher, which is not a big deal as storage is very cheap nowadays. As a bonus, because of immutability, you need only keep one backup of your database, which helps reduce the total storage size you’ll need. Interestingly, similar concerns were raised when Git was introduced as an immutable way of versioning source code, but it turned out that this isn’t much of an issue in practice.
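Why immutability makes caching so simple can be shown with a small Python sketch. This is a deliberately naive model and not how the Peer Library is implemented: the Storage and Peer classes, the segment names and the lookup rule are all invented for the example.

```python
class Storage:
    """Stand-in for remote storage; counts how many round trips it serves."""

    def __init__(self, segments):
        self._segments = dict(segments)
        self.round_trips = 0

    def fetch(self, segment_id):
        self.round_trips += 1
        return self._segments[segment_id]

class Peer:
    """Application-side reader that keeps every segment it has ever fetched."""

    def __init__(self, storage):
        self._storage = storage
        self._cache = {}

    def segment(self, segment_id):
        # Segments are immutable, so a cached copy can never become stale:
        # there is no invalidation logic to write.
        if segment_id not in self._cache:
            self._cache[segment_id] = self._storage.fetch(segment_id)
        return self._cache[segment_id]

    def email_of(self, user):
        # Hypothetical layout: users are split into two segments by first letter.
        segment_id = "users-a-m" if user[0].lower() <= "m" else "users-n-z"
        return dict(self.segment(segment_id))[user]

storage = Storage({
    "users-a-m": (("ann", "ann@example.com"), ("bob", "bob@example.com")),
    "users-n-z": (("zoe", "zoe@example.com"),),
})
peer = Peer(storage)
answers = [peer.email_of(name) for name in ["ann", "bob", "ann", "zoe", "bob", "ann"]]
```

Six queries end up costing two round trips here, because fetching one user brings its whole neighborhood into the cache and nothing ever needs to be refreshed.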

Community support

The Datomic community is small, but thanks to the enthusiasm of its members, it’s very responsive and welcoming. If you need help, you can just go to the Clojurians Slack, the Datomic mailing list, or StackOverflow, and you’ll get answers from experienced users, some of whom are very smart.

Consequences: your team gets expert advice for free.

Some of the hardest problems with databases are gone

This more philosophical section attempts to take a step back, reflecting on deeper database-related issues which undermine everyday software engineering.

Even though databases have existed for decades, data management is not a solved problem at all, as shown by the fact that mature and popular database systems like MySQL or PostgreSQL continue to issue releases with major new features.

There’s a handful of well-known, hard database-related problems that bite developers even in the most common use cases:

There’s no immediate solution for these problems, and I don’t believe there will ever be: they’re inherent to the fundamental choices of traditional database systems. But these problems simply don’t arise with Datomic, precisely because Datomic makes different fundamental choices (see Appendix C).

Consequences: developers can focus on business logic, instead of fighting the incidental complexity of their database.

Technical reasons*: essentially, this is enabled by 3 fundamental choices of Datomic — immutable storage, non-remote querying, and the Universal Schema. These are capabilities with far-reaching consequences for databases, just like Garbage Collection has far-reaching consequences for programming languages. Backing up these claims with technical evidence would take us too far here, but here are some insightful references:

Drawbacks and limitations

I’ve painted a very pretty picture of Datomic so far, right? But I wouldn’t be intellectually honest with you if I told you to use it for any project. Knowing the limitations of a tool is just as important as appreciating the benefits you get from using it, so I’ll try to depict these too.

Human Resources: although Datomic is relatively easy to learn, it will still require a shift in mindset for people used to relational databases. Therefore, your software team should be willing to learn it, and use it for projects which require intensive development, not just occasional maintenance (in which case Datomic is probably overkill).

Big Data: Datomic is not a good fit for storing huge amounts of data. This does not mean a Big Data information system cannot use Datomic, only that it cannot use Datomic for everything. You’ll typically store critical data which is most often involved in business logic in Datomic, and the rest of it in complementary stores (S3, HDFS, Kafka, etc.). At BandSquare, we store at least 10x more data than what we use in Datomic!

Commercial database: any serious, business-critical production deployment of Datomic will need to buy the license. This is completely worth it for most companies (especially since you can start for free), but for some individual side-projects it may be more problematic. And of course, a commercial license is more constraining than open-source.

Infrastructure footprint: a minimal deployment of Datomic requires at least a few Gigabytes of RAM. Hosted on the cloud, it will cost anywhere from a few tens to a few hundred dollars per month, which can be problematic for small side-projects.

Experience report: Datomic at BandSquare

So far, I’ve only described abstract general properties of Datomic; here’s how these have translated to building BandSquare’s platform, in concrete facts and numbers.

We started building a first version of our product in early 2014 using a stack that had a lot of hype among startups at the time: NodeJs and MongoDB. Over the course of 18 months, our business focus shifted from B2C and user-centric to B2B and data-centric, and our technical requirements evolved from small website prototypes to a whole data management platform providing advanced analytics and data visualization, in addition to a richer set of consumer-facing features with much more sophisticated business logic. All of this with a 2-developer team.

At the end of 2015, our technical situation was very difficult:

  • we spent 30 to 50% of our time fixing bugs, mostly because of the absence of tests and the difficulty of reproducing them
  • the simplest changes in requirements for aggregating / business rules queries, the kind we needed more and more frequently, took us hours to implement and performed terribly, mostly because of the lack of an expressive relational query language (we essentially ended up writing an ad hoc, client-side relational engine for any involved query, fetching many documents from different places and indexing them locally)
  • while facing these issues, we had to take the product to a whole new level with no hope of obtaining more human resources.

So we decided to migrate our server-side code from NodeJs and MongoDB to Clojure and Datomic. In 4 weeks, 13k lines of hacky, untested JavaScript code turned into 9k lines of well-tested Clojure code (plus 3k lines of tests).

After migrating, our situation drastically improved, mostly because of the above-mentioned properties of Datomic (although Clojure’s interactive development story also played a significant role in increasing productivity and quality):

  • the time we spend fixing bugs has gone down below 5%
  • we code in a Test-Driven Development style, mostly with system-level tests (as opposed to unit tests); all of our API endpoints are tested, and we have satisfying test coverage of the whole codebase.
  • The size of our testing code is only about 30% the size of our implementation code, which is unusually low for a TDD setting: according to my research, this figure is usually between 150% and 300% for TDD. In other words, we achieve a very good ROI from writing our tests. My analysis is that Datomic enables us to write mostly system-level tests instead of unit tests, and system-level tests yield much more coverage than unit tests do for a given amount of engineering effort (don’t get me wrong, I’m not saying you shouldn’t write unit tests; in a perfect world you should be able to write both kinds of tests. What I’m saying is: if you have a tight testing budget, you’ll want to cut down on unit tests first).
  • Our codebase has now grown to about 30k LoC, with no decrease in productivity and quality (rather the opposite), and we’re still a 2-developer team. Datomic’s flexible data modeling has helped us a lot in making our code more reusable.

Some productivity numbers:

  • Implementing a simplified Google Forms-style surveys system, with visualization and exports of results: 7 days, 600 LoC
  • Implementing a generic, graphical CRM tool, which enables campaign admins to segment their audience in terms of user actions (example: “select any user who subscribed to campaign A or campaign B, and answered question C with option D, and who has not received newsletter E”), and compiles these segments to Datomic queries for visualization, export and bulk emailing: 20 days, 3000 LoC
  • Writing an ETL pipeline which incrementally syncs our Datomic database to an ElasticSearch materialized view: 5 days, 1000 LoC
  • Writing the “change detection” code for the above: 2 hours, 100 LoC
  • Modifying all our write endpoints to annotate each database write with the user who originated it: 3 hours, 50 LoC
  • Manually recovering from an accidental deletion of a client account including hundreds of contacts: 30 min (no backups involved)
  • Reproducing the production environment on a local machine with local “Modify and Undo” capabilities (typically for debugging or demos): instantly

Summary

Datomic has a combination of special features which offer a lot of leverage to developer teams: productivity, ease of debugging, ease of testing, auditability, extensibility… but my favorite feature of Datomic is that I don’t need to think a lot about it when I program: it just lets me focus without getting in the way:

  • When I write data, I don’t need to anticipate how I’ll query it or how its schema will change;
  • When facing bugs, I’m confident that I won’t lose information and I’ll be able to reproduce and fix the problem in isolation (or prevent it with testing);
  • When I query, the data always feels near at hand and intelligible.

Once you get used to them, these capabilities seem like a given, because that’s what databases should feel like. I only realize how spoiled we are when going back to old-school databases, or seeing other teams struggle with them. If you wonder how impactful this is, there’s a similar historical precedent: ask experienced developers what it was like to move from code in files to version control systems.

Appendix A: What do we expect of a database system?

A database system provides 2 primary operations:

  • storing data, which means not only ensuring that what you store is saved durably, but also that the data you save is correct according to some business rules. Also called writing, persisting.
  • querying data, that is, not only retrieving the data that was saved, but also answering questions about it. Also called reading.

Appendix B: What’s the use case for Datomic?

The short answer is: the most common use case of databases in IT, the one for which people use MySQL, PostgreSQL or Oracle. Don’t use it for applications that only require very ‘dumb’ storage or for quick prototyping, that would be overkill!

Appendix C: What makes Datomic different?

Datomic is one of the very few pieces of technology I’d call revolutionary — trouble is, I have never met a database vendor who doesn’t call her product revolutionary. Here are some tangible elements that make Datomic stand out among database systems.

Datomic has 2 fundamental differences compared to mainstream database systems.

The first difference lies in the way Datomic stores information. Most databases work essentially like a slate, where a new piece of information is added by finding a place to write it, oftentimes by erasing an older piece of information. In contrast, Datomic works like a log (by log, I don’t mean a text file written by a web server; I mean the kind of log sailors write during a voyage to record what happens every day), in which every new piece of information is appended without touching the information that was previously written. Developers will refer to this as Datomic being ‘immutable’, ‘accumulate-only’, or having the ‘Database as a value’ property.

The second difference lies in the way Datomic represents information. Whereas a mainstream database stores its information in tables or documents of various shapes, Datomic only represents information in the form of small units of data of similar shapes, called datoms, which represent facts. This uniformity of data representation is called the Universal Schema.

It’s not obvious why these two fundamental characteristics of Datomic are useful; but they’re actually enablers for other, more desirable properties (listed above). This means that other database systems that haven’t made these fundamental choices cannot achieve these desirable properties, no matter how many million engineering hours have been spent on them.

Appendix D: Who’s telling you this?

When it comes to choosing technologies, you should only ever listen to comparisons drawn by people who have given a fair try to each alternative. Hopefully this section will convince you I’m one of those :)

The bulk of my experience in IT comes from my job as CTO of BandSquare, in which I’ve had the chance of tackling a relatively wide spectrum of technical problems — from UI to web application backend to data analysis — using a variety of software stacks through several versions of the product: Scala/Play, NodeJs, Clojure, MongoDB, ElasticSearch, Postgresql, and of course Datomic. Prior to that, I’d been programming software for a few years through a variety of IT internships, side projects and school projects, during which I’ve had the chance of using Java, JavaEE, and bits of Ruby on Rails and Python, backed by SQL Server, MySQL, and Postgresql. You may find me on the web on Github, StackOverflow, and LinkedIn.

Note that I’m not affiliated with Cognitect (the company stewarding Datomic) in any way, except as a Datomic customer (still on the Datomic Starter free plan, I’m a bit embarrassed to admit).

Thanks to Chloé Julien, Baptiste Dupuch, Pauline Vialatte, Nathan Skrzypczak, and Benoit Cotte for helping me on drafts of this post.