Published in re-engineering

idempotency

Idempotent is a word used to describe a system component that has the characteristic of “having the same outcome regardless of repeated action.”

That’s my definition anyway.

Officially, from googling, it is:

Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application. — Wikipedia

Close enough!

So, an example from math would be the operation of adding zero to zero:

0 + 0 = 0

You can repeat that addition over and over again and you’d still be at zero. Depressing, right? Nope, just idempotent! (ha ha.)
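The same idea can be checked in code. Below is a minimal Python sketch; the helper names (`add_zero`, `add_one`, `apply_repeatedly`) are made up for illustration. It compares applying an operation once with applying it five times. Adding zero and taking the absolute value give the same result either way, while adding one does not.

```python
def add_zero(x):
    """Idempotent: applying it again changes nothing."""
    return x + 0


def add_one(x):
    """Not idempotent: every extra call moves the result."""
    return x + 1


def apply_repeatedly(operation, value, times):
    """Feed the result of `operation` back into itself `times` times."""
    for _ in range(times):
        value = operation(value)
    return value


# Idempotent operations: once or five times, same outcome.
print(apply_repeatedly(add_zero, 0, 1) == apply_repeatedly(add_zero, 0, 5))  # True
print(apply_repeatedly(abs, -3, 1) == apply_repeatedly(abs, -3, 5))          # True

# Non-idempotent operation: repeating it changes the outcome.
print(apply_repeatedly(add_one, 0, 1) == apply_repeatedly(add_one, 0, 5))    # False
```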


At first glance, it may sound like something without practical use to software engineers, especially with such a funky name, but idempotency is an important and practical concept to have a handle on.

Let’s look at an example.

queues and jobs example

Say, for example, that you have customers placing orders on your ecommerce site.

Now, because your business is operating in Germany, it will need to comply with strict regulations around online transactions.

Let’s say that one such regulation requires every sale to be registered electronically within 24 hours, that the German government has approved a list of API endpoint providers, and that your back end must ping one of them to register each sale.

Because you don’t want your site visitors to wait, you build a queue system (say, with Redis) and add a job to it with every confirmed order placed on your store. So far, so good.

But because of the scale of your store (~10,000 orders placed every day), you have multiple consumers taking jobs off the queue to process. Sounds like everything would work fine.

However, when your CEO receives the tax report from the German tax authority a few months later, she is surprised to see that the taxable amount is 20% higher than your actual sales for the last year. What happened?

Naturally, she wants to find out, so she turns to you, Mr. Engineer. What the heck happened?

You crack open a bottle of mate and start searching through your logs. An hour later, you realise something: about 20 percent of all sales have been reported to the government-approved API endpoint twice.

Oh shit, you think to yourself. Better find out what happened and stop the bleeding.

How did this happen?

Turns out, Redis doesn’t guarantee that a job is only ever dispatched to a single consumer (I don’t know if this is true, but let’s pretend it is for illustration purposes). What does that mean? Think of this timeline:

  1. Person A places Order XXX
  2. Your back end adds a “register sale with government API” Job to the queue
  3. A few minutes later, the Job gets picked up by Consumer 1, but Consumer 2 picks it up at the same moment, before the Redis queue broker has recorded that Consumer 1 already took it
  4. Consumer 1 pings the government API for Order XXX
  5. Consumer 2 also pings the government API for Order XXX

Kaboom. Your system just double-reported on a single sale. Hence, double tax. This happens for 20 percent of your orders because why?

Because your jobs are not idempotent!
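To make the timeline concrete, here is a small Python sketch that replays it. Everything in it is a stand-in: `register_sale` just appends to a list instead of calling a real government-approved API, and the job shape and the amount of 100 are invented for the example. The point is only that a handler which reports on every run will report twice when the same job is delivered twice.

```python
# Hypothetical stand-in for the government-approved API:
# it simply records every report it receives.
registered_sales = []


def register_sale(order_id, amount):
    registered_sales.append((order_id, amount))


def handle_job(job):
    """Non-idempotent handler: reports the sale every time it runs."""
    register_sale(job["order_id"], job["amount"])


job = {"order_id": "XXX", "amount": 100}

handle_job(job)  # Consumer 1 processes the job
handle_job(job)  # Consumer 2 was handed the very same job

reported_total = sum(amount for _, amount in registered_sales)
print(len(registered_sales), reported_total)  # 2 200
```

One sale of 100 ends up on the books as 200, which is exactly the kind of gap your CEO spotted.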

one of many potential solutions: storing internal state

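One way to fix this is to implement a shared key-value store and have each job start by finding or creating the order ID key and setting its value to “processing”. The find-or-create step has to be atomic, otherwise two consumers can both miss the key at the same time and you are back to the same race. For example: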

jobsBeingProcessed = {
  "orderIdXXX": "processing"
}

With this in place, when Consumer 2 accidentally picks up the same job that Consumer 1 had already picked up, it would be able to detect that there is already an active consumer working on that job, and it can perhaps exit and pick up a different one, saving your company some additional taxes.
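Here is a rough Python sketch of that idea, under some loud assumptions: `ClaimStore` is a made-up in-memory class standing in for the shared key-value store, a `threading.Lock` stands in for whatever atomicity the real store provides, and `register_sale` is again a fake that appends to a list. Two consumer threads are handed the same order, and only the one that wins the claim reports it.

```python
import threading


class ClaimStore:
    """Hypothetical in-memory stand-in for a shared key-value store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs_being_processed = {}

    def claim(self, order_id):
        """Atomically find or create the key. True only for the first caller."""
        with self._lock:
            if order_id in self._jobs_being_processed:
                return False
            self._jobs_being_processed[order_id] = "processing"
            return True


store = ClaimStore()
registered_sales = []  # fake government API: records each report


def register_sale(order_id):
    registered_sales.append(order_id)


def handle_job(order_id):
    if not store.claim(order_id):
        return  # another consumer already has this job, so skip it
    register_sale(order_id)


# Consumer 1 and Consumer 2 both pick up the job for Order XXX.
consumers = [threading.Thread(target=handle_job, args=("XXX",)) for _ in range(2)]
for consumer in consumers:
    consumer.start()
for consumer in consumers:
    consumer.join()

print(len(registered_sales))  # 1
```

With a real Redis instance, the claim could probably be done with a single `SET` using the `NX` option, which only sets the key if it does not exist yet. A real version would also need to decide what happens when a consumer crashes after claiming a job but before reporting it, which this sketch ignores.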

There are other solutions, I’m sure. I just wanted to share one to give an idea of how idempotency can be achieved. Storing internal state actually doesn’t seem like a great idea. Stateless systems are usually less prone to failure.

notes from the field of software engineering

Nick Ang

Software Engineer @ Shopify. Dad, rock climber, writer, something something. Big on learning everyday.
