Choosing Elixir for Shedul’s tomorrow

Shedul is a complex Ruby on Rails system already working at quite a serious scale (check out my previous article for more on that). With the expansion of Shedul features and with the introduction of Fresha, an entirely new product that was supposed to work tightly with the existing system, we decided that it was time to go for microservices.

That, however, raised the question of which backend technology should lead the new services: one that would fit our business, productivity, operational and scalability needs. As a CTO I was happy to take a leading role in the complex process of researching, evaluating, choosing and introducing something new into our cozy yet shrinking world of the Rails monolith. Here’s the story.

Objectives

Here are the basic assumptions that I followed:

  1. As mentioned, we wanted and needed to go for microservices, so the technology would need to have this approach in its DNA.
  2. As Shedul is past the (very successful) initial startup market entry stage, we wanted a future-proof solution that would work at scale.
  3. Among the top upcoming features on the roadmap, almost half had to do with real time, so we needed a system that embraces WebSockets.
  4. With an existing team proficient in Ruby on Rails development, we needed a solution that they could approach and stay just as productive in.
  5. With an existing, mature ecosystem of application code, packages and dependencies, we needed a solution that would plug into all of that.

This is what landed on my plate about a year ago. But let me jump a little further back first — to early 2016.

Research

As an experienced Ruby on Rails developer, I have long enjoyed the perks of the framework: the productivity at a project’s early stages, the beauty and flexibility of Ruby’s syntax and the rich community full of pragmatic developers and mature open source. At Shedul in particular, we have successfully used it to deliver some amazing features in an equally amazing timeframe, putting Shedul in the position it is in right now.

But the project grew and started to face all the typical problems of monolithic Rails apps. Among them are the performance and scalability issues described in the excellent “Rails Web-Scale is Expensive” article, with which I completely agree. But there are other problems that go deeper into the code itself. More on that later.

Then came Elixir. After reading some books (the greatest among them being Programming Elixir) I got instantly hooked on the sheer beauty of the syntax, the data safety, concurrency, performance, maintainability and more. But there’s obviously more to choosing a whole new technology, even for personal use, not to mention large production systems, than just being hooked by something cool (otherwise, JS developers would all have gone nuts by now, considering the number of next big things per minute there).

So the research process began. I started a new blog, Phoenix on Rails, on which I evaluated each aspect of the Elixir/Phoenix stack as compared to the Ruby/Rails combo. By the way, the blog itself is also written in Phoenix. I made it my personal quest to answer all the key questions in a series of pieces there.

Some of these pieces landed in Elixir newsletters and gained the attention of community leaders such as José Valim and Saša Jurić, giving me a sample of the kindness and liveliness of the young community.

This was all finalized by running a real-life workshop at Visuality, which allowed me to get a basic grasp of how hard Elixir is for Ruby devs, a key question from Shedul’s perspective. With that, I was quite ready a few months later to put together a competent list of reasons and challenges for picking Elixir for Shedul and to finally make the call.

Reasons

Reason 1: Productivity

Elixir and Phoenix are both advertised as technologies that don’t sacrifice productivity. And they really don’t, thanks to a simple yet beautiful syntax akin to Ruby’s and a level of abstraction similar to that of Ruby/Rails. Implementing web, business logic or tooling code takes a similar number of lines in Elixir as in Ruby.

Moreover, Elixir+Phoenix improves upon Rails in some of these aspects. Here are a few examples:

  • Elixir compiles the app and catches lots of errors and bugs early (this enhances not just productivity but also quality)
  • a Phoenix app’s test suite runs an order of magnitude faster than a Rails one, which is extra important with database-heavy test suites
  • Elixir has no monkey-patching, which is a common root of bugs and confusion in Ruby and Rails apps/gems
  • OTP gives ready-to-use tools for coding and maintaining processes, something that requires effort (+ 3rd party gems) to get right in Ruby
  • functional code that operates on immutable data is easier to reason about, which makes it easier to modify someone else’s code
  • code structure encouraged by Phoenix 1.3 and data flow introduced in Ecto 2.0 make it easier to structure, implement and extend complex apps
(Image caption: Replace “compiling” with “running tests” and that’s how a Rails dev spends time developing large apps.)
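
To make the immutability point a bit more tangible, here is a minimal sketch. The `ImmutabilityDemo` module and the booking map are hypothetical names made up for this illustration, not code from Shedul. The function returns a new map and the caller’s data stays exactly as it was, so there is no hidden state change to track down later.

```elixir
defmodule ImmutabilityDemo do
  # Hypothetical helper: returns a NEW map with the price reduced by `percent`.
  # The map passed in is never modified.
  def discount(booking, percent) do
    Map.update!(booking, :price, fn price -> div(price * (100 - percent), 100) end)
  end
end

booking = %{service: "Haircut", price: 40}
discounted = ImmutabilityDemo.discount(booking, 25)

# `booking` still holds the original price; only `discounted` differs.
IO.inspect(booking, label: "original")
IO.inspect(discounted, label: "discounted")
```

In Ruby the equivalent method could mutate the hash it received, and the caller would only find out at runtime.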

Elixir also has a bunch of productivity perks when compared to other languages such as Go (as described in “Go concurrency considered harmful”) or Python (which has similar limitations to Ruby described in “Why we switched from Python to Go”).

Reason 2: Performance

As the test suite speed difference mentioned above suggests, Elixir smashes Ruby to dust when it comes to performance. In particular, the cost of instantiating ActiveRecord objects is something that sooner or later paralyzes every Rails project, either through huge production memory requirements or by slowing down the test suite to the point where you start removing useful tests and/or mocking everything, which makes tests less reliable.

One particular gain is the ultra-small cost of processes, which are the foundation of Elixir’s concurrency. Elixir is comparable to Go in this regard, although Elixir’s concurrency model is arguably easier to work with.
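
As a rough illustration of how cheap processes are, the sketch below spawns ten thousand of them, lets each one do a tiny bit of work and send the result back, and then sums the replies. `ProcessDemo` is a hypothetical module written only for this example, and the process count is an arbitrary number chosen to stay well below the VM’s default process limit.

```elixir
defmodule ProcessDemo do
  # Spawns `count` lightweight processes; each one sends the square of
  # its number back to the calling process as a message.
  def run(count) do
    parent = self()

    for n <- 1..count do
      spawn(fn -> send(parent, {:result, n * n}) end)
    end

    # Collect exactly `count` replies (in whatever order they arrive)
    # and add them up.
    Enum.reduce(1..count, 0, fn _, acc ->
      receive do
        {:result, value} -> acc + value
      end
    end)
  end
end

IO.puts(ProcessDemo.run(10_000))
```

There are no thread pools or locks here: every process has its own state and talks to the others only through messages.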

Reason 3: Real-time

Ok, so Elixir is fast and processes are cheap. So what? When it comes to classical HTTP servers, you may get away with a slow technology like Rails backed by aggressive horizontal scaling, but that’s not the case with WebSockets. The need to maintain an open socket with every connected user at scale requires a truly performant concurrency model. With Rails’ ActionCable being nothing short of a joke as far as WebSockets at production scale are concerned, this is IMO one of the main motivations for Rails developers to look into alternatives these days.

(Image caption: An accurate depiction of ActionCable’s max client connections in a typical production env.)

The delicate and uncompromising nature of WebSockets is one of the reasons for Node’s popularity, and Elixir easily outperforms it in this area, partly because it runs on a VM that efficiently uses all the processor cores. Heck, we’re talking millions of live connections handled by a single server here.

We have lots of features on the roadmap that will need WebSockets, and with the choice of Elixir we’ll now be technically ready to push them our users’ way with gold-standard performance at scale.

Reason 4: Operations

Thanks to immutability, OTP tooling and a unique concurrency model, Elixir apps are not just performant and easy to understand, but also well-behaved in production. Here’s why:

  • immutability and concurrency model reduce the risk of memory leaks, which (speaking from experience) are particularly problematic and hard to debug in Rails production systems
  • OTP offers tooling for and allows better control over how specific parts of the system react to failures of 3rd party services or internal subsystems, with early exception throwing being an encouraged practice
  • Erlang, and Elixir on top of it, were designed with distributed, interconnected services in mind, which maps naturally onto a microservice architecture, as reflected by the umbrella project pattern and the Erlang VM’s cross-node communication
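
The failure-handling point can be sketched with a plain OTP supervisor. Everything below (`Counter`, `SupervisionDemo`, the polling helper) is a hypothetical, minimal example rather than code from our system: a supervised process is killed on purpose and the supervisor brings back a fresh copy with clean state.

```elixir
defmodule Counter do
  use Agent

  # Started by the supervisor; registered under the module name.
  def start_link(_opts) do
    Agent.start_link(fn -> 0 end, name: __MODULE__)
  end

  def increment, do: Agent.update(__MODULE__, &(&1 + 1))
  def value, do: Agent.get(__MODULE__, & &1)
end

defmodule SupervisionDemo do
  def run do
    {:ok, sup} = Supervisor.start_link([Counter], strategy: :one_for_one)

    Counter.increment()
    old_pid = Process.whereis(Counter)

    # Simulate a crash; the supervisor should start a replacement.
    Process.exit(old_pid, :kill)
    new_pid = wait_for_restart(old_pid)

    # A new process is running and its state is back to the initial value.
    result = {old_pid != new_pid, Counter.value()}
    Supervisor.stop(sup)
    result
  end

  # Polls until a different process is registered under the Counter name.
  defp wait_for_restart(old_pid) do
    case Process.whereis(Counter) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end

IO.inspect(SupervisionDemo.run())
```

The code that uses `Counter` contains no rescue logic at all. Recovery is the supervisor’s job, which is what makes failing early a reasonable default.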

Combine this with Erlang’s three decades’ worth of production experience in hosting highly available services for a complete picture of possible operational improvements over Rails.

Challenges

Challenge 1: Team

There are almost no developers specialized specifically in Elixir on the market. This could make one doubt whether it’s possible to build a team coding in it, especially for an aggressively growing startup project. In theory that seems like a real problem, but it doesn’t match my programming experience.

Most proficient developers that I’ve met have not been constrained to one specific language. This is even more true with Ruby: in my opinion it’s particularly risky to hire Ruby-only senior developers since they often tend to blindly follow the (often harmful) “Rails way”. Instead I’ve learned to value those developers for whom the language is only a tool and who are able to choose one of a few in order to deliver a product in the technology that best fits the specific problem. Such developers are usually happy (and efficient) to learn a new language.

Also, Elixir has some unique advantages here. The syntax is definitely appealing to learn and pleasant to work with. Functional programming is very popular these days (think Scala), which gives a promise of talented students eager to learn and use it. The immutability on top of that completes the picture of a language that’s equally appealing to new and to existing developers, often tired of the usual OOP mess. This is all backed up by Elixir being ranked as the 7th most loved language.

Finally, my battle tests mentioned above showed that almost every Ruby on Rails developer quickly and happily switches to Elixir with minimal introductory overhead and without a long-term productivity drop.

I once more encourage you to read the excellent “Why We Switched from Python to Go” article, in particular the “The Ability to Build a Team” section, which also touches on this subject from the perspective of Go.

Challenge 2: Ecosystem

Having been used to tons of useful Ruby gems, one should consider whether the young Elixir package ecosystem will be a viable replacement. The number of Elixir packages is indeed incomparably smaller. Still, there are packages that cover almost all the dependency and base functionality needs that we have in our system and its services, including stable packages for PostgreSQL, Redis, RabbitMQ, Protocol Buffers, geolocation, database queries, hashing algorithms, performance monitoring and error tracking. Phoenix and Ecto offer every single feature that we needed from Rails and ActiveRecord, in most cases vastly improving upon and taking lessons from the Ruby equivalents.

In no small part this is thanks to the access to a rich and mature ecosystem of Erlang’s packages which can be used from Elixir. There’s less and less need to invoke Erlang code directly since for every useful Erlang package there’s usually an Elixir wrapper package available on Hex.

There are some areas for which one could find a Ruby gem but not an Elixir package. But this may actually be a healthy thing. Rails projects often overdo their reliance on open source. I believe that such packages should be used in moderation in order not to lose control over what happens under the hood and not to end up spending more time customizing existing solutions than it would take to implement custom-tailored code of one’s own. That’s why I believe that, in its current state, the Elixir package ecosystem offers a sweet spot of ready-made parts with a bit of a push towards implementing something from scratch every now and then.

Thankfully, the community, driven by José Valim, is a particularly helpful and lively one, as everyone can see for themselves on the Elixir Forum.

Challenge 3: Interoperability

Having an existing, mature, monolithic Ruby on Rails application, we had to ensure that the new microservices would have viable means of communicating both ways with the existing code. We decided to go with a custom-made RPC solution based on Protocol Buffers, which played out rather well.

We also had to ensure that our existing external dependencies, such as PostgreSQL, Redis, RabbitMQ, Sentry or New Relic, could be used from Elixir just as well as from Ruby. This was not a problem in most cases, except for New Relic, which had no official support for Elixir and whose community packages were not functional. This made us switch to AppSignal, which seems to have the best Elixir package out of all the monitoring platforms that support it.

Summary

We’ve been developing in Elixir for a few months now and we already have Elixir apps deployed and bangin’ in production. There were a number of technical and operational blockers that had to be removed along the way (which I hope to write about soon) in order for the technological switch to happen swiftly and for the initial tech debt to be minimized, but so far the new technology meets all our expectations.

Stay tuned for more stories from our Elixir battleground!

Oh, and here’s a bunch of other interesting reads that may be helpful for those who, like me a few months ago, are considering Elixir for their tech of choice:
