How To Master Your Async Queues

Event-driven architecture is the new black! This is the trend everyone is currently following in the industry. It’s full of advantages and there are a lot of articles explaining why you need to stop using APIs now.

Sometimes you need to move things around: breaking a service into microservices, renaming parts of your code to match a rebranding in your business, redesigning your architecture to comply with a new regulation… With asynchronous communication in the mix, these changes can be challenging.

Of course you can always use downtime to deploy a new version of your services, but this is far from being agile. True mastery comes when you’re able to manipulate any part of your backend. With no downtime, with no data loss, with no negative impact on your customers.

Let me teach you a set of techniques based on my experience in this area. Manipulating your code base doesn’t need to mean losing your sanity.

NSQ

At Augury, we use the NSQ queue service for our asynchronous communication. It’s a simple yet very effective queue service.
Throughout this article, we’ll use NSQ as the example implementation, but the techniques are compatible with any queue service. Let’s dive into the basic principles of NSQ…

NSQ works with topics and channels. A topic receives messages and replicates them to all of its channels. Within each channel, NSQ delivers every message to one of the connected consumers, with at-least-once semantics, so a consumer may occasionally see the same message twice. A channel belongs to a single topic, but you can reuse the same channel name under different topics if needed. Other queue services offer similar concepts under different names. The result looks like this:

Good documentation helps people understand your architecture, but naming matters too. We want topic names to reflect the kind of data the message holds, and channel names to describe the action performed on that data. That way, looking at the topic and channel names is enough to understand what is going on.
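To make the topic/channel model concrete, here is a minimal in-memory sketch. It is a toy stand-in, not the NSQ client API: the `Topic` class, the round-robin choice of consumer, and the `orders`, `send_invoice` and `update_stock` names are all assumptions made for illustration.

```python
import itertools
from collections import defaultdict


class Topic:
    """Toy in-memory stand-in for a topic; this is NOT the NSQ client API."""

    def __init__(self, name):
        self.name = name
        self.channels = defaultdict(list)  # channel name -> list of consumers
        self._next = {}                    # channel name -> round-robin iterator

    def subscribe(self, channel, consumer):
        self.channels[channel].append(consumer)
        # Rebuild the round-robin cycle whenever a consumer joins the channel.
        self._next[channel] = itertools.cycle(self.channels[channel])

    def publish(self, message):
        # Every channel gets its own copy; one consumer per channel handles it.
        for channel in self.channels:
            consumer = next(self._next[channel])
            consumer(message)


received = defaultdict(list)

# Topic named after the data, channels named after the action (hypothetical names).
orders = Topic("orders")
orders.subscribe("send_invoice", lambda m: received["accounting-1"].append(m))
orders.subscribe("send_invoice", lambda m: received["accounting-2"].append(m))
orders.subscribe("update_stock", lambda m: received["warehouse"].append(m))

for i in range(4):
    orders.publish({"order_id": i})
```

The warehouse consumer sees all four messages because it is alone on its channel, while the two accounting instances split the four messages between them because they share the `send_invoice` channel.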

With the basic implementation in mind, you are ready to learn your first technique.

First technique: using idempotency

If you are not familiar with idempotency, this is the coolest attribute your code could have. Running an idempotent piece of code several times with the same input leaves the system in the same state as running it once. It will not re-execute what already succeeded the first time.
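One common way to get this property is to remember which messages have already been applied. The sketch below assumes every message carries a unique `id` field; the `AccountingConsumer` class and the message shape are hypothetical, and a real service would keep the processed IDs in its database rather than in memory.

```python
class AccountingConsumer:
    """Hypothetical consumer that applies each payment at most once."""

    def __init__(self):
        self.balance = 0
        self.processed_ids = set()  # assumption: persisted in the DB in real life

    def handle(self, message):
        # A redelivered message must not change the state a second time.
        if message["id"] in self.processed_ids:
            return False
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


consumer = AccountingConsumer()
payment = {"id": "pay-42", "amount": 100}

first = consumer.handle(payment)   # applied
second = consumer.handle(payment)  # duplicate delivery, ignored
```

After both calls the balance is still 100, which is exactly what makes duplicate deliveries safe.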

If your consumers are idempotent, you can manipulate them with ease, because you don’t need to worry about duplicate messages. This comes in handy when, for example, you want to rename a topic. We’ll use this simple architecture as an example.

1. The producer starts sending each message to two topics, the new one and the old one

2. Migrate your consumers to consume from both topics

3. Stop producing to the old topic

4. Stop consuming from the old topic’s channel

5. Delete the old topics/channels in NSQ

In this case, the consumers receive each message twice for a period of time. Because they are idempotent, the duplicates are harmless.
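The rename can be replayed in a small simulation. This is only a sketch under stated assumptions: topics are modelled as plain in-memory lists, each message has a unique `id`, and the `old` and `new` labels, the `PHASES` table and `run_rename` are hypothetical names, not part of NSQ.

```python
from collections import defaultdict

# (topics the producer writes to, topics the consumer reads from) per phase
PHASES = [
    (("old",), ("old",)),              # before the migration
    (("old", "new"), ("old",)),        # step 1: produce to both topics
    (("old", "new"), ("old", "new")),  # step 2: consume from both topics
    (("new",), ("old", "new")),        # step 3: stop producing to the old topic
    (("new",), ("new",)),              # step 4: stop consuming from the old one
]


def run_rename(batches):
    """Replay the phases; `batches` holds one list of messages per phase."""
    queues = defaultdict(list)
    store, deliveries = {}, 0
    for (produce_to, consume_from), batch in zip(PHASES, batches):
        for message in batch:
            for topic in produce_to:
                queues[topic].append(message)
        for topic in consume_from:
            while queues[topic]:
                message = queues[topic].pop(0)
                deliveries += 1
                # Idempotent write: a second copy of the same id changes nothing.
                store.setdefault(message["id"], message["payload"])
    return store, deliveries


batches = [[{"id": i, "payload": f"report-{i}"}] for i in range(5)]
store, deliveries = run_rename(batches)
```

With one message per phase, the consumer handles more deliveries than there are messages, yet the store ends up with exactly one entry per message.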

As software developers, we should strive to design idempotent code whenever possible. For non-idempotent code, there are other working techniques.

Second technique: conquering a channel

Let’s go back to our example. This time the accounting service is not idempotent, and it is also part of a monolith. Extracting the accounting code from the monolith is possible, but you need to conquer the channel.

For the conquest to work, you need to be able to run both the monolith and the new microservice at the same time, which (most of the time) means they need to share the same DB during the transition phase. In a microservices architecture, we want each service to have its own DB. This is an exception: for a limited amount of time, two services will use the same DB.

To conquer the channel, configure the microservice to consume from the same topic and channel as the monolith. NSQ will send each message to only one of the two services. Since they run the same business logic against the same DB, it does not matter which one handles it.

Once the microservice is up and running, you can deploy a new version of the monolith that no longer consumes these events. That’s it, your microservice has conquered the channel.
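Here is a rough simulation of the conquest. It assumes a channel hands each message to exactly one attached consumer (modelled here with a simple round-robin) and uses a plain list as a stand-in for the shared DB; `make_service`, `deliver` and the `invoice_id` field are hypothetical.

```python
import itertools

shared_db = []  # stands in for the database both services write to


def make_service(name):
    def handle(message):
        # Same business logic in both services, same shared storage.
        shared_db.append((message["invoice_id"], name))
    return handle


def deliver(messages, consumers):
    """One channel: each message goes to exactly one attached consumer."""
    picker = itertools.cycle(consumers)
    for message in messages:
        next(picker)(message)


monolith = make_service("monolith")
microservice = make_service("microservice")

# Before: only the monolith consumes from the channel.
deliver([{"invoice_id": i} for i in range(4)], [monolith])
# Transition: both services share the channel and the DB.
deliver([{"invoice_id": i} for i in range(4, 8)], [monolith, microservice])
# After: the monolith no longer consumes, the microservice owns the channel.
deliver([{"invoice_id": i} for i in range(8, 12)], [microservice])
```

Every invoice lands in the shared DB exactly once, whichever service happened to handle it, which is why a non-idempotent consumer survives the transition.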

Third technique: getting ready upfront

This is the most important technique, as it will work for almost all scenarios, but it’s also the most complicated and longest one. Let’s take a look at this architecture.

A reporting service generates reports. Once a report is generated, the service produces a message that is replicated to many consumers, which start processing it.

Once the business starts to scale up, we add many types of reports. The consumers start to have different logic depending on the report type, and some report types are not relevant to every consumer. Services end up consuming events that do not concern them. You need to refactor the architecture to something like this:

Achieving this state is not easy, as we still want to avoid downtime, data loss and negative impact. We need to make sure every consumer is ready to receive messages. We’re going to use a simple architecture to illustrate…

1. Create the new NSQ topics/channels

2. Configure all the consumers (not the producers) for the new topic/channel configuration. They execute the same code for the new configuration as for the current one, and keep consuming from the current one

3. Deploy a new version of the producing service that publishes messages only to the new topics

4. Remove the old configuration from the consumers

5. Clean up the old topics/channels in NSQ

While this looks like the steps we used in the idempotent technique, there is a major difference: everything needs to be ready before changing the producers. The first technique allows you to stay in the transition phase for as long as you need. With this one, it’s all or nothing, because each message is produced to one topic or the other, never both.
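The ordering of these steps can be sketched with a toy broker. Everything here is an assumption made for illustration: the `Broker` class is an in-memory stand-in rather than NSQ, and the `reports`, `reports.monthly` and `reports.daily` topic names as well as the `billing` consumer are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker: every subscriber of a topic gets each message."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def unsubscribe(self, topic, handler):
        self.subscribers[topic].remove(handler)

    def publish(self, topic, message):
        for handler in list(self.subscribers[topic]):
            handler(message)


invoiced = defaultdict(int)  # report_id -> number of times it was billed


def billing(message):
    # Deliberately NOT idempotent: a duplicate would bill the report twice.
    invoiced[message["report_id"]] += 1


broker = Broker()
broker.subscribe("reports", billing)          # current configuration
broker.subscribe("reports.monthly", billing)  # step 2: ready before the switch

# Old producer version: everything goes to the single legacy topic.
broker.publish("reports", {"report_id": 1, "type": "monthly"})
# Step 3: new producer version, one type-specific topic per message.
broker.publish("reports.monthly", {"report_id": 2, "type": "monthly"})
broker.publish("reports.daily", {"report_id": 3, "type": "daily"})
# Step 4: drop the old configuration from the consumer.
broker.unsubscribe("reports", billing)
```

Because the producer writes each message to exactly one topic, the non-idempotent `billing` consumer bills reports 1 and 2 once each and never sees the daily report it does not care about.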

Last technique: the ugly one / the timestamp-based one

Sometimes you don’t have a choice: you need to use the technique that no one wants to use. It applies when you want to migrate some of a topic’s channels without modifying the others. Most of the time, you use the ugly technique when the topic stays the same while the channels change. No matter how much you want to avoid this technique, if your consumers are not idempotent you will need to use it.

The ugly one’s secret is running two configurations at the same time: you create duplicate messages, then filter them based on an arbitrary point in time. Let’s say you want to rename a channel to reflect a change in your organisation, taking our favorite architecture as an example.

1. Make sure the message body contains a created_at field

2. Create a new channel on the same topic with the newly chosen name

3. Pick an arbitrary timestamp later than the time you plan to deploy your change

4. Change your consumer to consume messages from both channels
If the created_at field is before the chosen timestamp, process the copy from the old channel and ignore the copy from the new channel
If it is equal to or after the chosen timestamp, process the copy from the new channel and ignore the copy from the old channel

5. Once the chosen timestamp is in the past and the old channel holds no more messages created before it, deploy a new version that uses only the new channel and no longer looks at the timestamp

6. Remove the old channel from NSQ

7. Stop populating the created_at field if you added it only for this migration
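The filter from step 4 fits in a few lines. This sketch assumes `created_at` is an ISO 8601 string with an explicit UTC offset; the `CUTOVER` value, the `old` and `new` channel labels and the `should_process` helper are hypothetical names chosen for the example.

```python
from datetime import datetime, timezone

# Hypothetical cutover instant, chosen later than the planned deployment.
CUTOVER = datetime(2030, 1, 1, tzinfo=timezone.utc)


def should_process(channel, message, cutover=CUTOVER):
    """Return True when this copy of the message is the one to act on."""
    created_at = datetime.fromisoformat(message["created_at"])
    if channel == "old":
        return created_at < cutover
    if channel == "new":
        return created_at >= cutover
    raise ValueError(f"unexpected channel: {channel}")


before = {"id": 1, "created_at": "2029-12-31T23:59:59+00:00"}
after = {"id": 2, "created_at": "2030-01-01T00:00:00+00:00"}

# Each message arrives once per channel; exactly one copy passes the filter.
processed = [
    (channel, message["id"])
    for message in (before, after)
    for channel in ("old", "new")
    if should_process(channel, message)
]
```

Comparing timezone-aware datetimes rather than raw strings avoids surprises when producers run in different time zones, which is one of the ways this technique can go wrong.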

This technique shows that there is always a way to manipulate message flows. The timestamp-based trick makes our consumer almost idempotent. It takes more time and is error-prone, which is why you should avoid techniques like this as much as you can.

Wrapping up

Techniques are a set of tools. They’re handy and can help you master your code base. Once you become familiar with them, you’ll start to see the patterns. You will understand why they work and why they guarantee data sanity. Understanding this will allow you to develop your own set of techniques. Let the wild adventurers improvise, adapt and overcome. You will learn, understand and master.
