My “Haskell In Production” Story

This is the story of the problem that led to this tweet, some challenges I faced along the way, as well as the outcome. More experienced Haskell programmers may shrug because they know what the language is truly capable of, but these are the types of challenges that commercial users of Haskell have to deal with and overcome, especially when betting on the language for the first time.

There intentionally aren’t many technical details in this post. I may write up a few deep dives separately.

Unlocking Business Data

Salesforce is the business system of record for much of my company’s data. That makes it an important system but unfortunately not an easy one for our own SaaS platform to interface with day-to-day. Fortunately, most of the time our backend developers just need to search or read the values of a few Salesforce objects. Bi-directional integrations, those that require reading from and writing to Salesforce, are rare. Even so, untangling the Salesforce APIs, dealing with authorization issues, and handling integration errors can feel like a huge burden when all you want is a little bit of data.

It is also interesting that most of the time we don’t need live data. We need a reasonably up-to-date view, but if this lagged Salesforce by a few minutes that would be fine. This intuition suggested that we could look for a reliable way to replicate data from Salesforce into our own backend. Given a read-only replica of Salesforce data in, say, a PostgreSQL database, we could query and work with the objects in a more natural way. Our backend developers would still be exposed to the Salesforce data model but they could stay focused on the business logic within their own code. They would not have to worry about the technical details of API integrations. This reliable data replication capability would just become a foundational building block in a larger solution.

As simple as it sounds, there are still surprising complexities in building this.

In order to replicate Salesforce data, we need to use two of their APIs: the “async” API and the REST API. The former provides batch-style access to bulk data, useful for initial replication or rebuilding replicas after critical errors. Among other things, the latter provides near real-time notifications of updated and deleted records, useful for keeping replicas up-to-date.
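To make the two roles concrete, here is a minimal sketch (not the production code) of how a bulk snapshot and a stream of change notifications might combine into a replica. Everything here is an assumption made for illustration: the `Replica`, `Change`, `fromSnapshot`, and `applyChanges` names are hypothetical, records are simplified to lists of field/value pairs, and an in-memory map stands in for PostgreSQL.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified types: a record is a list of field/value pairs,
-- and the replica is an in-memory map keyed by record ID.
type RecordId = String
type Fields   = [(String, String)]
type Replica  = Map.Map RecordId Fields

-- A change notification, as it might arrive from the near real-time API.
data Change
  = Upserted RecordId Fields   -- record was created or updated
  | Deleted RecordId           -- record was deleted
  deriving (Show, Eq)

-- Build (or rebuild) the replica from a bulk snapshot.
fromSnapshot :: [(RecordId, Fields)] -> Replica
fromSnapshot = Map.fromList

-- Apply a single change to the replica.
applyChange :: Replica -> Change -> Replica
applyChange replica (Upserted rid fields) = Map.insert rid fields replica
applyChange replica (Deleted rid)         = Map.delete rid replica

-- Apply a batch of changes in arrival order.
applyChanges :: Replica -> [Change] -> Replica
applyChanges = foldl' applyChange
```

Under these assumptions, rebuilding after a critical error is just another call to `fromSnapshot`, after which `applyChanges` resumes with whatever notifications arrive next.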

Both APIs use security tokens for authorization and these must be periodically obtained via OAuth2.
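As a rough illustration of what "periodically obtained" can mean in code, the sketch below decides when a token should be fetched again. It assumes the client knows (or chooses) an expiry time for each token; the `Token` type, the `safetyMargin` value, and the `getNow` and `fetchToken` actions are all hypothetical stand-ins, and the real OAuth2 exchange is deliberately left out.

```haskell
-- Hypothetical token type: the token string plus an expiry time,
-- expressed as seconds since some fixed epoch.
data Token = Token
  { tokenValue :: String
  , expiresAt  :: Integer
  } deriving (Show, Eq)

-- Refresh a little early so that in-flight requests do not race the expiry.
-- The 60-second margin is an arbitrary choice for this sketch.
safetyMargin :: Integer
safetyMargin = 60

-- True when the token is expired or about to expire at time `now`.
needsRefresh :: Integer -> Token -> Bool
needsRefresh now tok = now >= expiresAt tok - safetyMargin

-- Return a usable token, running the supplied fetch action only when needed.
-- `getNow` and `fetchToken` are passed in so the logic stays testable.
currentToken :: Monad m => m Integer -> m Token -> Token -> m Token
currentToken getNow fetchToken tok = do
  now <- getNow
  if needsRefresh now tok then fetchToken else pure tok
```

Passing the clock and the fetch action as parameters keeps the decision logic pure enough to test without touching the network.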

Both APIs return JSON representations of the underlying Salesforce objects, but, as we’ll see, there are some subtle differences between the two representations and these have to be reconciled.

Our backend developers may not need live data but they do need confidence that the data is correct. Replication error handling needs to be well thought out.
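One small ingredient of this kind of error handling is retrying transient failures. The following is a minimal sketch, not the approach used in the actual project: `retryWithBackoff` is a hypothetical helper that assumes failures are reported as `Left` values and simply doubles the delay between attempts.

```haskell
import Control.Concurrent (threadDelay)

-- Run an action that reports failure as a Left value. On failure, wait and
-- try again, doubling the delay each time, for at most `attempts` tries in
-- total. The last error is returned if every attempt fails.
retryWithBackoff
  :: Int                 -- maximum number of attempts (including the first)
  -> Int                 -- initial delay in microseconds
  -> IO (Either e a)     -- the action to run
  -> IO (Either e a)
retryWithBackoff attempts delayMicros action = do
  result <- action
  case result of
    Right value -> pure (Right value)
    Left err
      | attempts <= 1 -> pure (Left err)   -- out of attempts: give up
      | otherwise -> do
          threadDelay delayMicros
          retryWithBackoff (attempts - 1) (delayMicros * 2) action
```

Returning the final `Left` rather than throwing leaves the caller to decide whether a persistent failure should pause replication, raise an alert, or trigger a rebuild.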

Finally, systems tend to break at their integration points so we need good monitoring and visibility into status and whatever errors do occur. This integration is an important production system in its own right.

Why Haskell?

Go is our de facto standard backend language. I made this choice a couple of years ago, and we standardized on Go before starting our SaaS development. All of our backend services are written in Go and it is a popular choice in other areas, e.g. for CLI tools and utilities. Go would normally be the automatic choice for a problem like this. As much as I’m not a fan of the actual language, I admit that it is easy to be productive in the Go ecosystem, and when thinking about this problem I felt I could see the solution in my mind’s eye. It would be no big deal to write, and maybe kind of boring.

But I’ve had an interest in Haskell for over a decade. I got interested in Erlang around 2006, discovered functional programming, and read somewhere that “if you really want to understand functional programming, then you should look at Haskell”. So I did, and I never went back to playing with Erlang. I’ve used Haskell for personal projects but never at work.

I think Haskell is an interesting tool for a variety of problems. The story has improved greatly since 2006 and the language is now a viable, although niche, choice for commercial use. When I had the chance I was sorely tempted to standardize on Haskell over Go, but for a variety of internal reasons I could not responsibly make this choice (this wasn’t just a choice between those two languages — I also considered a couple of others before ultimately going with Go).

I’ve been keen to give Haskell a try for a “real” project, so when this problem showed up it seemed like it might be an opportunity to make the work more interesting.

At first glance, this problem is not an obvious fit for Haskell. It is very IO-oriented: making OAuth2 requests, moving data around from one system to another, implementing retries, and handling various forms of errors. This is the kind of “real world” code that jams people up in the language. But I’ve also heard enough times that Haskell can be a tremendous asset in the real world so I decided to give it a try.

Yes! Haskell Can Do That!

During development I was stretched well out of my Haskell comfort zone, but I never regretted the choice. Not surprisingly, IO- and error-related issues were the biggest challenges.

True to what I’d heard, Haskell did turn out to be an asset rather than a liability:

These last two problems in particular made me glad that I’d chosen Haskell for this project. I’m sure they could have been solved in Go, but the elegance and concision of the Haskell-based solution is impressive.

Outcome

The code has been running in production for about a month now, replicating data from Salesforce to PostgreSQL, and we haven’t had to touch it once. This kind of operational reliability is typical of what we expect and enjoy with our Go-based projects, so I consider that a win.

I did not keep a journal of my time spent on the project, but from my commit history I conservatively estimate that I spent about three person-weeks actually coding (I was the sole developer). That time included some exploration and problem solving in the Salesforce APIs, which of course had nothing to do with Haskell. I have no reliable way of estimating how long the project would have taken if I’d used Go instead. My gut feeling is that it would have taken about half the time but I have no way of validating this.

Why so much longer? As an experienced Go developer I don’t need to spend much time thinking about how to solve problems in that language, and the opposite was clearly true in Haskell. I don’t use Haskell day-to-day and natural solutions aren’t at the forefront of my mind. To be fair, I haven’t looked for an incremental JSON parser in Go, and if one doesn’t exist that would certainly have evened things up a bit.

I’d also like to acknowledge that absolutely key to my success was access to a local community of Haskell enthusiasts. I live in the Research Triangle Park area of NC where the Haskell community is small. However small, this group of people gets together regularly and they were vital in terms of their technical knowledge as well as for staying motivated.

I got through it and, of course, I learned a great deal. If I had it to do over again, or if I had to solve a similar problem, I’m confident that development would take much less time. This is a Haskell theme that I’ve heard before and I’m looking forward to putting it to the test.

Unfortunately I cannot open-source the code from this project, but I can write about it. I have a few posts in mind, but if there’s something specific you’d like me to unpack, please let me know!

Reformed Executive, Software Architect, and Product Person