Stateless and Dumb Microservices on a Message Bus

Published in Capital One Tech, Jan 16, 2018

High traffic view of a ship yard showing many containers and coordination between them.

The Big Data world is moving to large distributed systems that pass messages along a message bus. Sure, we’ve been making API calls and using enterprise service buses for what feels like forever, but by introducing Kafka we’ve changed the game. Kafka flips the enterprise service bus model inside out. Historically, we’d put the logic on the bus: the code and routing logic deployed on the bus knew how to pass messages around.

We have a new challenge these days since our message bus is no longer intelligent. We have to put routing logic in code or configuration, which raises the question, “How do we define message routes?” There are a few options, which we’ll go into below, but we need to choose among them intentionally. If we don’t, we’ll end up with a “Rat King” of routing rules through our microservices. Rat Kings are nasty balls of wax: think of the worst knot you’ve ever had to untangle, then mix it in with four more knots. This is the last thing we want for our microservices.

In the simplest version of passing a message along a bus, each microservice reads from an input topic and writes to an output topic. That’s two pieces of routing data per microservice. Seems simple enough. But when you want to change the path, or have a dynamic path, things get more confusing.
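To make the "two pieces of data" concrete, here is a minimal sketch in Python. It assumes an in-memory dictionary of lists as a stand-in for a real message bus, and the topic names `BATHER_IN` and `BATHER_OUT` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Message = dict
Bus = Dict[str, List[Message]]  # in-memory stand-in for a real message bus


@dataclass(frozen=True)
class ServiceConfig:
    input_topic: str   # where this service reads from
    output_topic: str  # where this service always writes to


def run_once(config: ServiceConfig, bus: Bus,
             work: Callable[[Message], Message]) -> None:
    """Consume one message from the input topic and publish the result."""
    message = bus[config.input_topic].pop(0)
    bus.setdefault(config.output_topic, []).append(work(message))


bus: Bus = {"BATHER_IN": [{"gorilla": "first"}]}
bather = ServiceConfig(input_topic="BATHER_IN", output_topic="BATHER_OUT")
run_once(bather, bus, lambda m: {**m, "bathed": True})
```

The output topic is fixed in configuration here, which is exactly why changing the path later means touching every service along the way.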

What I’m proposing here is a truly stateless component that has no idea “what’s next” except in the case of an error. Let’s visualize this with a little example.

So You’ve Decided to Get a Gorilla for Your Zoo…

Bring a shipping manifest to mind. Inside the box (payload) is the item you’re going to ship. If you order a gorilla from another zoo to be shipped to your zoo, they will put the gorilla in a cage inside that box. On its way to you, you may want the gorilla bathed, nails trimmed, and fluff dried. Oooh, let’s get two gorillas for our zoo! The second one doesn’t want trimmed nails — she’s just particular like that. The second gorilla will get bathed and fluff dried, no nails. I repeat, NO NAILS!

Now that we’ve talked about our use case of bathing and shipping gorillas between zoos, let’s talk about how we get into the approaches proposed below. If you think about the gorilla processing capabilities as a list of services offered at a groomer, then how do you tell the groomer what to do?

You’d walk up to the groomer, give them your gorilla, and tell them the services you want. In the technology world, it’s much the same thing. We can expose an API that “anyone” can call. We can provide parameters that determine the features we need, or we can provide different endpoints with predefined paths.

The Old School Model

Flow chart diagram showing the thought process of grooming your gorilla.

If you build an old school message bus, your ‘bather’ needs to decide where the message goes downstream. The more options you add, the more confusing it gets. With our simple example, we can build an easy route, but imagine there were fifteen more options in grooming. The gorilla needs a facial or a foot massage? Nail polish and buff? Then maybe you want to add an optimization algorithm to send the gorilla to the next shortest queue. This is where it starts to get confusing.
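A rough sketch of what that looks like in code, assuming an in-memory dictionary as the bus and a hypothetical `trim_nails` flag on the message. Notice that the routing rule lives inside the bather itself.

```python
from typing import Dict, List

Bus = Dict[str, List[dict]]  # in-memory stand-in for a real message bus


def next_topic_after_bath(gorilla: dict) -> str:
    """Routing rule that lives inside the bather itself."""
    if gorilla.get("trim_nails"):
        return "NAIL_TRIMMER"
    return "FLUFF_DRYER"


def bather(gorilla: dict, bus: Bus) -> None:
    """Bathe the gorilla, then decide where it goes downstream."""
    bathed = {**gorilla, "bathed": True}
    bus.setdefault(next_topic_after_bath(bathed), []).append(bathed)


bus: Bus = {}
bather({"name": "first", "trim_nails": True}, bus)
bather({"name": "second", "trim_nails": False}, bus)
```

Every new grooming option adds another branch to `next_topic_after_bath`, and every other service grows a similar function of its own.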

The Conductor (or Orchestrator)

Let’s introduce a new concept, the conductor. When you ship something like a gorilla (I mean, who DOESN’T ship a gorilla every weekend?), you tell the logistics company, or the train company, where to send it and what to do with it. I say, “The first gorilla gets bathed, trimmed and fluffed and the second gorilla just gets bathed and fluffed.”

In this concept, we can send the gorilla to get bathed, then back to the conductor to decide where the gorilla goes next. That’d look something like this:

View of an assembly line, grooming your gorilla.

Wow, the conductor is going to be busy! Talk about a single point of scaling, or even failure! Distributed version control like Git makes it easier for different development teams to keep the routing logic up to date, but it can very quickly become an ugly Rat King of routing choices. Not to mention the conductor will end up listening to many, many topics in a Kafka-based implementation, or to a single meta-topic that every microservice publishes back to.
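Here is a small sketch of the conductor loop, under some assumptions: the route table `ROUTES`, the `completed` field, and the hop counter are all invented for illustration, and plain function calls stand in for publishing to topics.

```python
from typing import Optional, Tuple

# Hypothetical route table owned by the conductor.
ROUTES = {
    "first": ["GROOMER", "NAIL_TRIMMER", "FLUFF_DRYER"],
    "second": ["GROOMER", "FLUFF_DRYER"],
}


def conductor_next(message: dict) -> Optional[str]:
    """Pick the next topic from how many steps are already done."""
    route = ROUTES[message["gorilla"]]
    done = len(message["completed"])
    return route[done] if done < len(route) else None


def service(topic: str, message: dict) -> dict:
    """A dumb service: do the work, then hand the message back."""
    return {**message, "completed": message["completed"] + [topic]}


def run(message: dict) -> Tuple[dict, int]:
    """Drive one message to completion, counting bus hops."""
    hops = 0
    while (topic := conductor_next(message)) is not None:
        message = service(topic, message)
        hops += 2  # conductor -> service, then service -> conductor
    return message, hops


result, hops = run({"gorilla": "second", "completed": []})
```

The services stay simple, but every step costs two hops and every message passes through `conductor_next`, which is where the scaling concern comes from.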

The Manifest

Next, we can explore the manifest. A manifest is a list of stops. In our case, let’s build a stack (yes, remember that data structure?) of topics the message needs to go down. Our gorilla would have an initial stack of “GROOMER, NAIL_TRIMMER, FLUFF_DRYER” or “GROOMER, FLUFF_DRYER”, depending on which gorilla is being handled.

Now, your conductor does some work up front (yes, new scenarios will cause new code changes) to set up the initial stack. Once the stack is built, the conductor passes the gorilla (message) off to the first topic in the stack. The microservice (ah ha! See? I did it!) that’s listening to that first topic (GROOMER) will pop its name off the stack, groom the gorilla (perform whatever work the message needs: enhance it, flag errors, drop it, etc.) and peek at the next item on top of the stack. If the next item is NAIL_TRIMMER, it will publish the enhanced item to the NAIL_TRIMMER topic. The NAIL_TRIMMER microservice will do the same: pop its topic off the stack, update the item (or trim the gorilla’s nails) and then peek at the next topic.

So, it continues until the last topic…something like DELIVER_GORILLA_TO_CUSTOMER. That’d look something like this:

Orchestration of a gorilla getting groomed using a message bus.
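The pop-and-peek walk can also be sketched in a few lines of Python. This is only an illustration: the `WORK` table and the `manifest` and `payload` field names are assumptions, the stack is a plain list whose index 0 is the top, and a loop stands in for the bus delivering each message to the next topic.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical work functions, keyed by the topic each service listens on.
WORK: Dict[str, Callable[[dict], dict]] = {
    "GROOMER": lambda g: {**g, "bathed": True},
    "NAIL_TRIMMER": lambda g: {**g, "nails_trimmed": True},
    "FLUFF_DRYER": lambda g: {**g, "fluffed": True},
}


def handle(topic: str, message: dict) -> Tuple[Optional[str], dict]:
    """Pop our own topic, do the work, and peek at the next stop."""
    manifest = list(message["manifest"])  # index 0 is the top of the stack
    if not manifest or manifest[0] != topic:
        raise ValueError(f"{topic} is not on top of the manifest")
    manifest.pop(0)
    payload = WORK[topic](message["payload"])
    next_topic = manifest[0] if manifest else None
    return next_topic, {"manifest": manifest, "payload": payload}


def ship(manifest: list, payload: dict) -> dict:
    """Conductor: build the stack once, then let the services pass it along."""
    message = {"manifest": list(manifest), "payload": payload}
    topic: Optional[str] = message["manifest"][0] if manifest else None
    while topic is not None:
        topic, message = handle(topic, message)
    return message["payload"]


first = ship(["GROOMER", "NAIL_TRIMMER", "FLUFF_DRYER"], {"name": "first"})
second = ship(["GROOMER", "FLUFF_DRYER"], {"name": "second"})
```

No service in this sketch contains a routing decision. The second gorilla skips the nail trimmer simply because that topic was never put on her manifest.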

Believe it or not, this feels like a much cleaner model to me. Each component is dumb. The routing can live in a shared library or module, along with common logging, error handling, etc. Put a little callback function in your code per microservice, and it’s pretty straightforward.
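One way such a shared module might look, as a sketch rather than a finished library. The names `make_handler`, `publish`, and the `GORILLA_ERRORS` topic are hypothetical. Each microservice supplies only its callback, and the wrapper handles popping the manifest, publishing onward, and diverting failures.

```python
import logging
from typing import Callable

log = logging.getLogger("routing")
ERROR_TOPIC = "GORILLA_ERRORS"  # hypothetical topic name

Publish = Callable[[str, dict], None]


def make_handler(topic: str, callback: Callable[[dict], dict], publish: Publish):
    """Wrap a per-service callback with shared routing and error handling."""

    def handler(message: dict) -> None:
        manifest = list(message["manifest"])  # index 0 is the top of the stack
        try:
            if not manifest or manifest[0] != topic:
                raise ValueError(f"{topic} is not on top of the manifest")
            manifest.pop(0)
            payload = callback(message["payload"])
        except Exception:
            log.exception("%s failed; diverting to %s", topic, ERROR_TOPIC)
            publish(ERROR_TOPIC, message)  # the only route a service knows
            return
        if manifest:  # nothing left means the journey is over
            publish(manifest[0], {"manifest": manifest, "payload": payload})

    return handler


sent = []
groom = make_handler("GROOMER", lambda g: {**g, "bathed": True},
                     lambda topic, msg: sent.append((topic, msg)))
groom({"manifest": ["GROOMER", "FLUFF_DRYER"], "payload": {"name": "second"}})
```

In a real deployment `publish` would wrap whatever producer client you use. Keeping it as a plain function argument also makes the routing easy to test without a broker.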

Here’s an example of what the message structure can look like:
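One possible shape, sketched in Python and serialized as JSON. The field names (`manifest`, `error_topic`, `payload`) and the `GORILLA_ERRORS` topic are assumptions for illustration, not a fixed schema.

```python
import json

# One possible message envelope. Index 0 of "manifest" is the top of the stack.
message = {
    "manifest": ["GROOMER", "NAIL_TRIMMER", "FLUFF_DRYER",
                 "DELIVER_GORILLA_TO_CUSTOMER"],
    "error_topic": "GORILLA_ERRORS",  # hypothetical name
    "payload": {"name": "first gorilla", "bathed": False,
                "nails_trimmed": False, "fluffed": False},
}

encoded = json.dumps(message).encode("utf-8")  # bytes as they would go on the bus
decoded = json.loads(encoded)                  # what the next service would read
```

Keeping the route and the error destination in the envelope, separate from the payload, is what lets each service stay unaware of anything beyond its own work.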

The challenge with a predefined message path is just that: it’s a predefined path. If there’s a branch or an option anywhere in the middle of the route, then you almost have to break the single conductor model, since special microservices would have to push a new topic onto the stack. That means they need to know about more than their own topic and their own work.

As long as this pushing is done very infrequently, and is very clearly an exception in the overall process of these messages, I don’t expect it to get out of control. The challenge will be dealing with the eventuality of too many exceptions. Then you’re back to a Rat King of routing, asking yourself repeatedly, “Now why is that service processing the message?”
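A sketch of what one of those exceptions could look like. The `has_fleas` flag and the `FLEA_TREATMENT` topic are invented for this example, and the manifest is a list whose index 0 is the top of the stack.

```python
def groomer(message: dict) -> dict:
    """Pop GROOMER, bathe the gorilla, and push an extra stop only if needed."""
    manifest = list(message["manifest"])  # index 0 is the top of the stack
    manifest.pop(0)                       # pop our own topic
    payload = {**message["payload"], "bathed": True}
    if payload.get("has_fleas"):          # hypothetical exceptional case
        manifest.insert(0, "FLEA_TREATMENT")  # push onto the top of the stack
    return {"manifest": manifest, "payload": payload}


routine = groomer({"manifest": ["GROOMER", "FLUFF_DRYER"],
                   "payload": {"name": "second"}})
itchy = groomer({"manifest": ["GROOMER", "FLUFF_DRYER"],
                 "payload": {"name": "third", "has_fleas": True}})
```

The groomer now knows that a `FLEA_TREATMENT` topic exists. One such branch is manageable, but each additional one moves a little more routing knowledge back into the services.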

Caveat

One key point of discussion here is around ‘design time’ vs ‘run time’ options. My gorilla bather ALWAYS cleans the gorilla’s ears, regardless of whether I want them to or not. Even if I tell them not to clean its ears, they’re going to do it anyway. That’s a design time decision. My gorilla bather always gives me the option to trim my gorilla’s nails or not, and I’ll specify my preference when I give them my gorilla. That’s a run time decision. If your routes are fairly static and don’t change frequently, I’m all for the Manifest pattern. If you have a fair amount of run time decisions on routing, then the Conductor pattern could be your best choice. This is up to you. However, my advice if you follow the Conductor pattern is to make sure you have a clear and concise way of evaluating rules. Configure a rules engine of some sort to keep state and make the decision.

To Wrap Up

I’ve introduced three routing patterns that occur when writing microservices that process data in a serial fashion. There are plenty of other options available; one that intrigues me is using a rules engine to dictate where a message goes next. Imagine a configuration file for different steps defined in a meta-language like Gherkin (used for behavior-driven development) to define where messages go. If your gorilla is already clean, and wants to get its nails trimmed, then send it to the Nail Trimmer.
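As a toy illustration of the rules idea, written as plain Python predicates rather than Gherkin. The rule list and the field names (`clean`, `wants_nail_trim`, `nails_trimmed`, `fluffed`) are assumptions made up for this sketch.

```python
from typing import Callable, List, Optional, Tuple

Rule = Tuple[Callable[[dict], bool], str]

# Hypothetical rules, checked top to bottom; the first match wins.
RULES: List[Rule] = [
    (lambda g: not g["clean"], "GROOMER"),
    (lambda g: g["wants_nail_trim"] and not g["nails_trimmed"], "NAIL_TRIMMER"),
    (lambda g: not g["fluffed"], "FLUFF_DRYER"),
]


def next_topic(gorilla: dict) -> Optional[str]:
    """Return the topic of the first matching rule, or None when done."""
    for matches, topic in RULES:
        if matches(gorilla):
            return topic
    return None


clean_gorilla = {"clean": True, "wants_nail_trim": True,
                 "nails_trimmed": False, "fluffed": False}
destination = next_topic(clean_gorilla)
```

Because the rules are data, they could in principle be loaded from a configuration file instead of being compiled into any one service.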

In summary, be very intentional about the long-term structure you choose when you’re building a framework or integration pattern of smart code around a dumb message bus.

Would love to hear some feedback on the thought! Tweet me @chrisfauerbach or send me a note at https://fauie.com/.

Make sure to check out my other blog post on the topic — Inverting the Message Bus.

DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2018 Capital One.