Safe payments using Sagas

Elijah Kim
Frame.io Engineering
4 min read · Jul 17, 2018

Recently I was tasked with migrating our billing infrastructure from one plan per subscription to a plan plus additional line items per subscription.

Setting the scene

We have an internal database and a payment provider, and we needed to ensure that both were always in a consistent state. For example, if the payment service went down while a user was trying to update their plan, we had to ensure that our database didn’t update with the new plan while the payment service remained stale. In the end, we implemented Sagas using Sage, the brilliant library that Andrew Dryga created. But before we get into the solution, let’s talk about how our code used to look.

The State of the Union

Our original code for changing a plan looked something like this.
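As a rough sketch of that shape only: the version below assumes Ecto 3 (where `Ecto.Multi.run/3` callbacks receive the repo and the changes so far), and every name except `update_subscription/3` and `:update_plan` is a hypothetical stand-in, including the stubbed helpers and the order of the steps before `:update_plan`.

```elixir
# Sketch only: names other than update_subscription/3 and :update_plan are hypothetical.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Billing.MultiSketch do
  alias Ecto.Multi

  # Builds the Multi; a real caller would hand it to Repo.transaction/1.
  def update_subscription(user, subscription, new_plan) do
    Multi.new()
    |> Multi.run(:validate, fn _repo, _changes -> validate(user, new_plan) end)
    # External call: a database rollback cannot undo it.
    |> Multi.run(:payments, fn _repo, _changes -> update_payments(subscription, new_plan) end)
    # Last step, so nothing else runs after the local write.
    |> Multi.run(:update_plan, fn _repo, _changes -> update_plan(subscription, new_plan) end)
  end

  # Stubs standing in for real checks, the Payments API and a Repo update.
  defp validate(_user, plan), do: {:ok, plan}
  defp update_payments(_subscription, plan), do: {:ok, %{remote_plan: plan}}
  defp update_plan(subscription, plan), do: {:ok, %{subscription | plan: plan}}
end

multi = Billing.MultiSketch.update_subscription(%{id: 1}, %{id: 10, plan: :free}, :pro)

# Lists the step names in the order they would run.
IO.inspect(Keyword.keys(Ecto.Multi.to_list(multi)))
```

The sketch only builds the Multi and lists its steps, since actually running it needs a configured Repo.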

The problem here was that if :update_plan failed, we would have no way to roll back our Payments operation. It wasn’t the biggest deal, however, because :update_plan was the last function called, so if it failed, it was a simple database rollback. With the new requirements, however, this had to change.

Let’s get started

The general steps we needed to take when adding a line_item to a subscription, or removing one from it, were these:

  1. Validate that the user can make the change.
  2. Validate that the line_item is a valid addition to the subscription.
  3. Add a record to the subscription_line_items table.
  4. Update the subscription cache with the new limits.
  5. Tell Payments about the change.
  6. If any of these fail, roll back everything done so far.

Using Ecto.Multi, we could write something similar to what we did for update_subscription/3.
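A minimal sketch of the steps listed above as an Ecto.Multi, again assuming Ecto 3. The module, the step names and the stubbed helpers are all hypothetical; they only mirror steps 1 to 5.

```elixir
# Sketch only: every module, step and helper name here is hypothetical.
Mix.install([{:ecto, "~> 3.0"}])

defmodule Billing.LineItemMultiSketch do
  alias Ecto.Multi

  def add_line_item(user, subscription, line_item) do
    Multi.new()
    |> Multi.run(:can_change, fn _repo, _changes -> authorize(user, subscription) end)
    |> Multi.run(:valid_item, fn _repo, _changes -> validate_item(subscription, line_item) end)
    |> Multi.run(:line_item, fn _repo, _changes -> insert_line_item(subscription, line_item) end)
    |> Multi.run(:cache, fn _repo, %{line_item: row} -> update_cache(subscription, row) end)
    # Step 5 leaves the database, so Ecto has nothing to roll back for it.
    |> Multi.run(:payments, fn _repo, %{line_item: row} -> notify_payments(subscription, row) end)
  end

  # Stubs standing in for real validation, inserts, cache writes and API calls.
  defp authorize(_user, _subscription), do: {:ok, :authorized}
  defp validate_item(_subscription, line_item), do: {:ok, line_item}
  defp insert_line_item(subscription, line_item), do: {:ok, %{subscription_id: subscription.id, item: line_item}}
  defp update_cache(_subscription, row), do: {:ok, %{limits_from: row.item}}
  defp notify_payments(_subscription, row), do: {:ok, {:added, row.item}}
end

multi = Billing.LineItemMultiSketch.add_line_item(%{id: 1}, %{id: 10}, :extra_seat)
IO.inspect(Keyword.keys(Ecto.Multi.to_list(multi)))
```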

However, the problem of rolling back (or compensating) Payments when something in the chain fails still existed.

A New Hope

Enter Sage

“Sage is a dependency-free implementation of Sagas pattern in pure Elixir. It is a go to way when you dealing with distributed transactions, especially with an error recovery/cleanup.”

…which is exactly what we needed to solve our payments issue.

Sage's API isn’t very different from Ecto.Multi's either. A direct translation of the code above to Sage looks like this:
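Here is a sketch of what such a translation can look like, assuming Sage 0.6 and hypothetical stubbed helpers. Each `Ecto.Multi.run` becomes a `Sage.run` whose callback receives the effects so far and the attrs. The sketch finishes with `Sage.execute` so it runs without a database; with a real Repo we would finish with Sage.transaction instead.

```elixir
# Sketch only: module, stage and helper names are hypothetical.
Mix.install([{:sage, "~> 0.6"}])

defmodule Billing.SageSketch do
  def add_line_item(user, subscription, line_item) do
    Sage.new()
    |> Sage.run(:can_change, fn _effects, _attrs -> authorize(user, subscription) end)
    |> Sage.run(:valid_item, fn _effects, _attrs -> validate_item(subscription, line_item) end)
    |> Sage.run(:line_item, fn _effects, _attrs -> insert_line_item(subscription, line_item) end)
    |> Sage.run(:cache, fn %{line_item: row}, _attrs -> update_cache(subscription, row) end)
    |> Sage.run(:payments, fn %{line_item: row}, _attrs -> notify_payments(subscription, row) end)
    |> Sage.execute()
  end

  # Stubs standing in for real validation, inserts, cache writes and API calls.
  defp authorize(_user, _subscription), do: {:ok, :authorized}
  defp validate_item(_subscription, line_item), do: {:ok, line_item}
  defp insert_line_item(subscription, line_item), do: {:ok, %{subscription_id: subscription.id, item: line_item}}
  defp update_cache(_subscription, row), do: {:ok, %{limits_from: row.item}}
  defp notify_payments(_subscription, row), do: {:ok, {:added, row.item}}
end

result = Billing.SageSketch.add_line_item(%{id: 1}, %{id: 10}, :extra_seat)
IO.inspect(result)
```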

I won’t go into Sage’s API too much. The documentation is pretty great and I suggest you read it. Anyway, we could clean it up a bit by doing something like this.
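One possible cleanup, sketched under the same assumptions (Sage 0.6, hypothetical names): move everything the steps need into the attrs map, and give each step a named function with the `(effects_so_far, attrs)` shape so the pipeline reads as a list of captures.

```elixir
# Sketch only: module, stage and helper names are hypothetical.
Mix.install([{:sage, "~> 0.6"}])

defmodule Billing.CleanSketch do
  def add_line_item(user, subscription, line_item) do
    Sage.new()
    |> Sage.run(:can_change, &authorize/2)
    |> Sage.run(:valid_item, &validate_item/2)
    |> Sage.run(:line_item, &insert_line_item/2)
    |> Sage.run(:cache, &update_cache/2)
    |> Sage.run(:payments, &notify_payments/2)
    |> Sage.execute(%{user: user, subscription: subscription, line_item: line_item})
  end

  # Every step takes (effects_so_far, attrs) and returns {:ok, effect} or {:error, reason}.
  defp authorize(_effects, %{user: _user}), do: {:ok, :authorized}
  defp validate_item(_effects, %{line_item: item}), do: {:ok, item}

  defp insert_line_item(_effects, %{subscription: sub, line_item: item}),
    do: {:ok, %{subscription_id: sub.id, item: item}}

  defp update_cache(%{line_item: row}, _attrs), do: {:ok, %{limits_from: row.item}}
  defp notify_payments(%{line_item: row}, _attrs), do: {:ok, {:added, row.item}}
end

result = Billing.CleanSketch.add_line_item(%{id: 1}, %{id: 10}, :extra_seat)
IO.inspect(result)
```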

However, this still doesn’t fix the rollback problem. Let’s fix that.
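A sketch of the idea, assuming Sage 0.6, where a compensation is a three-argument function `(effect_to_compensate, effects_so_far, attrs)`; check the compensation type in the docs for the version you run. `FakePayments` is a hypothetical in-memory stand-in for the provider, and the cache step is placed after the Payments step purely so that a later failure can trigger the compensation.

```elixir
# Sketch only: FakePayments and all stage and helper names are hypothetical.
Mix.install([{:sage, "~> 0.6"}])

defmodule FakePayments do
  # In-memory stand-in for the remote payment provider.
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(fn -> MapSet.new() end, name: __MODULE__)
  def add_item(item), do: Agent.update(__MODULE__, &MapSet.put(&1, item))
  def remove_item(item), do: Agent.update(__MODULE__, &MapSet.delete(&1, item))
  def items, do: Agent.get(__MODULE__, &MapSet.to_list/1)
end

defmodule Billing.CompensatedSketch do
  def add_line_item(attrs) do
    Sage.new()
    |> Sage.run(:payments, &add_to_payments/2, &remove_from_payments/3)
    |> Sage.run(:cache, &update_cache/2)
    |> Sage.execute(attrs)
  end

  defp add_to_payments(_effects, %{line_item: item}) do
    :ok = FakePayments.add_item(item)
    {:ok, {:added, item}}
  end

  # Compensation: undo the remote change, then let Sage keep unwinding.
  defp remove_from_payments({:added, item}, _effects, _attrs) do
    :ok = FakePayments.remove_item(item)
    :ok
  end

  # Nothing is known to have been added, so there is nothing to undo.
  defp remove_from_payments(_unknown_effect, _effects, _attrs), do: :ok

  # Simulated failure in a later step.
  defp update_cache(_effects, %{cache_down?: true}), do: {:error, :cache_unavailable}
  defp update_cache(_effects, _attrs), do: {:ok, :cached}
end

{:ok, _pid} = FakePayments.start_link()

result = Billing.CompensatedSketch.add_line_item(%{line_item: :extra_seat, cache_down?: true})
IO.inspect(result)
IO.inspect(FakePayments.items())
```

When the cache step fails, Sage walks back through the stages that already ran and calls their compensations, so the fake provider should end up without the line item.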

There are more intricate compensations that you can implement, and I encourage you to read about them here: https://hexdocs.pm/sage/Sage.html#t:compensation/0

Close but no cigar

This looks pretty great, but falls apart when we think about changing plans.

Why? Changing plans itself is pretty complicated. Here are the steps we need to take for a successful plan change:

  1. Validate the user can actually change the plan.
  2. Charge the user for the new plan.
  3. Migrate over the line items for the old plan to the new plan (hopefully we can use the function we created above to help us. Spoiler alert — we can. Sort of).
  4. Roll back all the changes if anything fails along the way.

However, in the state that our codebase is in at the moment, our sagas aren’t composable. Let’s fix that by creating a Sagas module that holds functions that take sagas and return sagas.
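A minimal sketch of such a module, assuming Sage 0.6. The function names, stage names and stubbed steps are hypothetical; the point is only that each public function takes a saga and returns a saga, so callers can chain them in any combination.

```elixir
# Sketch only: function, stage and helper names are hypothetical.
Mix.install([{:sage, "~> 0.6"}])

defmodule Billing.Sagas do
  # Every public function takes a saga and returns a saga, so they chain with |>.

  def validate_change(sage), do: Sage.run(sage, :can_change, &authorize/2)

  def add_line_item(sage) do
    sage
    |> Sage.run(:valid_item, &validate_item/2)
    |> Sage.run(:line_item, &insert_line_item/2)
    |> Sage.run(:cache, &update_cache/2)
  end

  def notify_payments(sage),
    do: Sage.run(sage, :payments, &add_to_payments/2, &remove_from_payments/3)

  # Stubbed steps: each one matches on the effects and attrs it depends on.
  defp authorize(_effects, %{user: _user}), do: {:ok, :authorized}
  defp validate_item(_effects, %{line_item: item}), do: {:ok, item}
  defp insert_line_item(_effects, %{line_item: item}), do: {:ok, %{id: 1, item: item}}
  defp update_cache(%{line_item: row}, _attrs), do: {:ok, %{limits_from: row.id}}
  defp add_to_payments(%{line_item: row}, _attrs), do: {:ok, {:added, row.id}}
  defp remove_from_payments(_effect, _effects, _attrs), do: :ok
end

result =
  Sage.new()
  |> Billing.Sagas.validate_change()
  |> Billing.Sagas.add_line_item()
  |> Billing.Sagas.notify_payments()
  |> Sage.execute(%{user: %{id: 1}, line_item: :extra_seat})

IO.inspect(result)
```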

Now you can imagine the case where we can use the previous pattern to compose Sagas for changing plans. Here’s what the first iteration could look like.
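As one way to picture it, the sketch below composes a plan change out of saga-returning functions, assuming Sage 0.6. `Ledger` is a hypothetical in-memory stand-in for Payments and the database, and every stage and helper name is made up for illustration. Each migrated line item gets its own compensated stage, and compensations only undo effects they recognise.

```elixir
# Sketch only: Ledger and all stage and helper names are hypothetical.
Mix.install([{:sage, "~> 0.6"}])

defmodule Ledger do
  # In-memory stand-in for the payment provider and database.
  use Agent

  def start_link(_opts \\ []), do: Agent.start_link(fn -> [] end, name: __MODULE__)
  def record(entry), do: Agent.update(__MODULE__, &[entry | &1])
  def entries, do: Agent.get(__MODULE__, &Enum.reverse/1)
end

defmodule Billing.PlanSagas do
  def change_plan(attrs, old_line_items) do
    Sage.new()
    |> validate_change()
    |> charge()
    |> migrate_line_items(old_line_items)
    |> Sage.execute(attrs)
  end

  def validate_change(sage), do: Sage.run(sage, :can_change, &authorize/2)
  def charge(sage), do: Sage.run(sage, :charge, &charge_plan/2, &refund_plan/3)

  # Adds one compensated stage per line item carried over from the old plan.
  def migrate_line_items(sage, line_items) do
    Enum.reduce(line_items, sage, fn item, acc ->
      move = fn _effects, attrs -> move_item(item, attrs) end
      Sage.run(acc, {:migrate, item}, move, &undo_move/3)
    end)
  end

  defp authorize(_effects, %{user: %{owner?: true}}), do: {:ok, :authorized}
  defp authorize(_effects, _attrs), do: {:error, :forbidden}

  defp charge_plan(_effects, %{new_plan: plan}) do
    Ledger.record({:charged, plan})
    {:ok, {:charged, plan}}
  end

  # Only undo work that is known to have happened.
  defp refund_plan({:charged, plan}, _effects, _attrs), do: Ledger.record({:refunded, plan})
  defp refund_plan(_unknown_effect, _effects, _attrs), do: :ok

  defp move_item(item, %{allowed_items: allowed}) do
    if item in allowed do
      Ledger.record({:moved, item})
      {:ok, {:moved, item}}
    else
      {:error, {:not_allowed_on_new_plan, item}}
    end
  end

  defp undo_move({:moved, item}, _effects, _attrs), do: Ledger.record({:unmoved, item})
  defp undo_move(_unknown_effect, _effects, _attrs), do: :ok
end

{:ok, _pid} = Ledger.start_link()

attrs = %{user: %{owner?: true}, new_plan: :pro, allowed_items: [:extra_seat]}
result = Billing.PlanSagas.change_plan(attrs, [:extra_seat, :archival_storage])

IO.inspect(result)
IO.inspect(Ledger.entries())
```

In this run the second line item is rejected, so the ledger should show the first item being moved back and the charge being refunded, in reverse order of how they happened.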

It’s pseudocode but you get the gist. Hah. There’s an optimization you can do as well where you can split out the Payment calls and run them all in an async block. I’ll let you figure that out yourself. 😉

We made it!

We’ve finally achieved what we wanted to. We created a beautiful function that can both run steps asynchronously and roll back if anything fails along the way. On top of that, we created a pretty nice pattern that we can extend and, if requirements change, modify and compose. I was pretty proud of it myself when I finally got here.

With that said, I have a few nits. The first is that there are multiple places where “state” can come from. Sage.transaction takes a params argument, but you can also use variables from outside of that transaction. It’s also impossible to modify params during the lifecycle of the transaction, so even if you wanted to unify your state by updating the params map, you couldn’t. There are also times when you can’t know what information you’re going to need later on, so you’re stuck using function arguments, leaving you with two sources of truth.

Another nit is that any function in the Transactions module that I created requires a specific match against the transactions that have already run and the params map. I think this is fine for me (the person who wrote the code), and I’ve tried my best to document all the functions in there, but there’s a lot of complexity involved in using the Transactions functions at the right time for other engineers to pick up on. I don’t know what the solve is here but there must be a better way.

In closing, this is an awesome library and it definitely cleaned up our business logic, especially with payments. I’m hoping we can eventually wholesale replace Ecto.Multi. Huge shout-out to andrew_dryga (I’ve linked his twitter this time) for crafting this library and open sourcing it. It’s some great stuff and I believe that it’ll be here to stay. If you have any questions, feel free to ask and I’ll get to them as soon as I can.
