Pub and Sub… what do those have in common?

Essential RubyOnRails patterns, part 5: Pub/Sub

Błażej Kosmowski
Aug 11 · 22 min read

as seen by RubyOnRails Developers @ Selleo

Foreword

What is Pub(lish)/Sub(scribe) pattern?

“In software architecture, publish–subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. Similarly, subscribers express interest in one or more classes and only receive messages that are of interest, without knowledge of which publishers, if any, there are.”

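Even though it seems similar to the observer pattern, the place where subscriptions are stored distinguishes the two significantly: an observed subject keeps the list of its own observers, whereas in Pub/Sub subscriptions are kept outside the publisher, so publishers and subscribers do not need to know about each other.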

Why should we use Pub/Sub?

One of the simplest means to better implement SRP

The single responsibility principle (SRP) might be simple to comprehend and apply in simple scenarios, but in large, production-grade applications with complex business logic it is challenging and requires techniques that may increase indirection significantly. This is because it is not always clear how many business logic steps/rules we can bundle together. Should we decompose a given class into smaller ones, or is it not worth it and the class is fine as it is?

Pub/Sub addresses this issue by letting you assume, most of the time, that your class/object should accomplish just one thing and communicate it to the rest of the application. Subscribers then take care of all the side effects of whatever has just been communicated.
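To make the idea concrete, here is a minimal sketch in plain Ruby. The `EventBus` and `CreateOrder` classes and the `:order_created` event are hypothetical names invented for illustration; they are not taken from any library. The publisher does one thing and announces it, while the side effects live in subscribers.

```ruby
# A minimal, in-process event bus (hypothetical; not taken from any library).
class EventBus
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a callable to be run whenever `event_name` is published.
  def subscribe(event_name, handler)
    @subscribers[event_name] << handler
  end

  # Deliver the payload to every handler subscribed to `event_name`.
  def publish(event_name, payload)
    @subscribers[event_name].each { |handler| handler.call(payload) }
  end
end

# The publisher does one thing (creates an order) and announces it.
class CreateOrder
  def initialize(bus)
    @bus = bus
  end

  def call(order_id:, customer_email:)
    # ... persisting the order would happen here ...
    @bus.publish(:order_created, { order_id: order_id, customer_email: customer_email })
    order_id
  end
end

bus = EventBus.new
sent_emails = []
stock_updates = []

# Side effects live in subscribers, not in CreateOrder.
bus.subscribe(:order_created, ->(payload) { sent_emails << payload[:customer_email] })
bus.subscribe(:order_created, ->(payload) { stock_updates << payload[:order_id] })

CreateOrder.new(bus).call(order_id: 1, customer_email: "jane@example.com")
puts sent_emails.inspect
puts stock_updates.inspect
```

Adding a third side effect means adding a subscriber; `CreateOrder` itself stays untouched.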

Facilitates spaghetti-code reduction

Makes applications more modular

The potential simplicity comes from the fact that we have a ready-to-use medium for communicating with modules extracted this way: events/messages. Sometimes it might be necessary to change the bus used to propagate those events, but the interface (the event’s payload) can in most cases stay the same. This makes moving towards a decentralised/asynchronous architecture much more natural and can leave an application much better prepared for potential scaling.

If we decide to use an external bus for conveying events, we can also leverage the fact that we can implement event handlers in a different technology/stack than our core one. This allows us to use the best tool for a given task with ease.

Renders breaking large tasks into smaller ones easier

Makes logging significant events in the system easy

In Pub/Sub, events occurring in the system convey lots of comprehensible data that outline what is happening in the system. Plugging into this stream of information is just a matter of creating a subscriber dedicated to logging and redirecting the aforementioned stream to a file, STDOUT, a database, CloudWatch, an external logging service or any other destination that might prove useful in our case.

Observing such a stream of events can be beneficial in visually identifying anomalies that would otherwise be hidden in the abyss of regular server logs. Adding event tracing into the equation to indicate how events are correlated with each other can make it even more useful.
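A logging subscriber of this kind could be sketched as follows, using only Ruby's standard `Logger` and `JSON`. The `EventLogger` class, the event names and the `trace_id` field (used here to correlate related events) are assumptions made for the example.

```ruby
require "json"
require "logger"
require "stringio"

# Hypothetical subscriber that turns every event into one structured log line.
class EventLogger
  def initialize(io)
    @logger = Logger.new(io)
    # Keep only the message so that each line is plain JSON.
    @logger.formatter = ->(_severity, _time, _progname, message) { "#{message}\n" }
  end

  # `trace_id` is an assumed field used to correlate related events.
  def call(event_name, payload, trace_id:)
    @logger.info(JSON.generate(event: event_name, trace_id: trace_id, payload: payload))
  end
end

output = StringIO.new # could just as well be $stdout or a File
event_logger = EventLogger.new(output)
event_logger.call("ordering.order_created", { order_id: 1 }, trace_id: "abc-123")
event_logger.call("messaging.order_notification_sent", { order_id: 1 }, trace_id: "abc-123")
puts output.string
```

Because both lines share the same `trace_id`, it is easy to filter the stream for everything that followed from a single order.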

Simplifies data migrations

Using events for this purpose makes the process more comprehensible and readable. All side effects of such changes will also be applied transparently, unless we explicitly unsubscribe from those events. Unsubscribing might be useful for handling some special cases, and is a powerful technique in itself.

When to use Pub/Sub?

  • medium-sized to large-sized
  • with easy-to-identify domains/modules that need to communicate with each other
  • integrating with one or more external systems, especially if those integrations are broad
  • huge, entangled monoliths in need of refactoring
  • applications in need of modularisation for scaling or other purposes

The more of the characteristics above apply to a given application, the more benefits Pub/Sub will bring to the table.

When NOT to use Pub/Sub?

  • small to medium sized
  • with a scarce amount of domains/modules or ones that are difficult to identify
  • not integrating with external systems
  • already implementing some specific/dedicated, well thought-through architectural solutions that would neither benefit nor play nice with Pub/Sub
  • prototypes that are just experimental, in which case Pub/Sub would just add unnecessary overhead

This does not mean that Pub/Sub will definitely not work if your application shares some of the traits above, although it might prove not to be worth the extra effort needed to introduce it effectively.

A word on nomenclature

In this article “messages” are referred to as “events”, “subscribers” are “event handlers”, “topics” become “domains”, and the concept of a “message bus / broker” turns into programmatically defined “subscriptions” (in Pub/Sub, subscriptions are usually realised by the infrastructure itself).

BDD the Pub/Sub way

To expedite the introduction of, or transition to, the Pub/Sub approach, it is good to have a set of guidelines to follow when working with new requirements in the system. Following behaviour-driven development principles is one example of such a rule set. Some steps outlining this approach in the context of Pub/Sub are presented below.

Identify problem domains

“There are only two hard things in Computer Science: cache invalidation and naming things.”

— Phil Karlton

I usually prefer to name domains after adjectives or nouns that revolve around concepts related to the business domain or around the types of services I integrate the application with: Messaging, Ordering, Inventory, CustomerRelationshipManagement, IOT… There is no easy rule here: something that is totally confusing in the context of one app can be crystal clear in the context of another. A rule of thumb might be: if a domain name is clear to your customer, and when using it you feel that both of you are talking about the same thing, then it is not a bad name. Once we know the right names, it is time for…

Identifying events

Events are all about something that has just happened, so they should usually be named after verbs in their past-tense form. There might be a temptation to derive event names from commands, e.g. RequestEmailDelivery, yet those are no longer events; those are… commands. While commands live really close to events, it is good to keep the two concepts separate, as the distinction might be really useful in the future. In our case, EmailDeliveryRequested would fit the expected pattern better.

Event names should be unambiguous and, same as domain names, should be understandable by the customer. In fact domain names, event names and data they carry should become the new language we use to talk about the product and describe features. It would also be beneficial to maintain some sort of glossary clarifying the meaning of the names we use.

Some examples of valid event names are CustomerCreated, OrderNotificationSent, DoorOpened, NewOrderReceived, PricingMismatchIdentified, etc. If the name accurately describes what has just happened, then it is probably a good event name.

Events should also be assigned to specific domains. Still, it might happen that events sharing the same name could be published by different domains — this is not a problem at all.

Identifying events’ payload

  1. Including irrelevant data: this usually happens when planning the payload for subscribers and can result in a payload that is irrelevant in the context of the given event. Usually, such data should be included in a different event or retrieved by different means (e.g. fetched from the database when handling the event).
  2. Including too much data: similar to the one above; only data that makes sense in the scope of a given event should be included.
  3. Including too little data: especially if the missing data is necessary to describe the context of the event in full.
  4. Improperly nesting data: in particular when some data that should be nested is not. A good example is including all of the object’s attributes as separate fields of the event payload instead of wrapping them in an attributes field.

There is also a recommendation not to include anything in the payload that is not actively used by all subscribers and just introduce more, fine-grained events. In my opinion, it might be a bit too extreme in most cases though.

Examples of some potential payloads

  • CustomerCreated: customer_id, customer_attributes
  • OrderNotificationSent: order_id, recipient_id, recipient_email
  • DoorOpened: door_id, keycard_uuid
  • NewOrderReceived: order_id, order_placed_at
  • PricingMismatchIdentified: product_id, local_price, remote_price

The examples above are certainly not set in stone and are not the only right ones. The payload we define should always depend on the context in which a given event is broadcast.
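One way to express such payloads in code is sketched below. The `Event` base class and the `Crm` and `Pricing` domain modules are hypothetical; the field names come from the list above. The base class rejects both missing and unknown fields, which guards against the "too little data" and "irrelevant data" mistakes early.

```ruby
# Hypothetical base class: declares payload fields and rejects
# incomplete or unexpected data at construction time.
class Event
  def self.attributes(*names)
    @attribute_names = names
    attr_reader(*names)
  end

  def self.attribute_names
    @attribute_names || []
  end

  def initialize(**payload)
    missing = self.class.attribute_names - payload.keys
    unknown = payload.keys - self.class.attribute_names
    raise ArgumentError, "missing: #{missing.join(', ')}" unless missing.empty?
    raise ArgumentError, "unknown: #{unknown.join(', ')}" unless unknown.empty?

    payload.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def to_h
    self.class.attribute_names.to_h { |name| [name, public_send(name)] }
  end
end

# Domain modules are assumed names; events are namespaced by domain.
module Crm
  class CustomerCreated < Event
    # customer_attributes is nested rather than spread over separate fields.
    attributes :customer_id, :customer_attributes
  end
end

module Pricing
  class PricingMismatchIdentified < Event
    attributes :product_id, :local_price, :remote_price
  end
end

event = Crm::CustomerCreated.new(
  customer_id: 42,
  customer_attributes: { name: "Jane", email: "jane@example.com" }
)
puts event.to_h.inspect
```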

Time to TDD!

Integration testing

TDD Publisher(s)

It is worth mentioning that each unique event should preferably be emitted from just one place. Not only will this make it easier to comprehend the application flow later on, but in most, if not all, cases it just feels right.
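A publisher test can stay very small if the bus is replaced with a recording double. The sketch below uses plain Ruby instead of a test framework; `RecordingBus`, `RegisterCustomer` and the `crm.customer_created` event name are assumptions made for illustration.

```ruby
# Test double that records what was published instead of delivering it.
class RecordingBus
  attr_reader :published

  def initialize
    @published = []
  end

  def publish(event_name, payload)
    @published << [event_name, payload]
  end
end

# Hypothetical publisher under test: the only place that emits this event.
class RegisterCustomer
  def initialize(bus)
    @bus = bus
  end

  def call(id:, name:)
    # ... persisting the customer would happen here ...
    @bus.publish("crm.customer_created", { customer_id: id, customer_attributes: { name: name } })
  end
end

bus = RecordingBus.new
RegisterCustomer.new(bus).call(id: 1, name: "Jane")

expected = [["crm.customer_created", { customer_id: 1, customer_attributes: { name: "Jane" } }]]
raise "expected exactly one crm.customer_created event" unless bus.published == expected
puts "publisher emits the expected event"
```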

TDD Event(s)

TDD Subscription(s)

One of the recommended approaches to organising subscriptions is to listen for events globally, instead of in some specific context. This allows us to see all subscriptions in one place and quickly get a grasp of how the whole thing works. Also this way we prevent coupling of different problem domains which might be introduced by on-spot subscriptions.

In this place, we also usually decide if a given event should be handled synchronously, asynchronously or in some other, specific way. This aspect should be covered by tests as well.

Another optional assertion would be to ensure that both the event and the event handler classes exist, which would add more “integration” value to this otherwise very lightweight test scope. It would also drive further implementation if we decide to start by implementing the subscription part first.

On many occasions the subscription is something we just forget about, and its absence might not show up as an obvious bug during the manual testing phase. This is especially true when an event is already emitted, so it just feels right to write a handler for it… and forget about “the glue”. It is usually beneficial to introduce some kind of linter that verifies that none of the handlers we have introduced is left dangling in a void without anything that can actually call it.
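Such a check does not need to be sophisticated. Assuming subscriptions are kept in one global map and handler classes follow a `*Handler` naming convention (both are assumptions of this sketch, as are all the names in it), a linter could look like this:

```ruby
module Messaging
  class OrderCreatedHandler; end
  class OrderCancelledHandler; end # written, but never subscribed
end

# All subscriptions in one place: event name => handler classes.
SUBSCRIPTIONS = {
  "ordering.order_created" => [Messaging::OrderCreatedHandler]
}.freeze

# Returns handler classes defined directly in `namespace`
# that no subscription refers to.
def dangling_handlers(namespace, subscriptions)
  defined_handlers = namespace.constants
                              .map { |name| namespace.const_get(name) }
                              .select { |const| const.is_a?(Class) && const.name.end_with?("Handler") }
  defined_handlers - subscriptions.values.flatten
end

puts dangling_handlers(Messaging, SUBSCRIPTIONS).inspect
```

Run as part of the test suite, a non-empty result can simply fail the build.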

TDD Subscriber(s)

Controlling flow with events

When it comes to the actual flow, it is best to not assume any particular order of events’ processing. Sometimes selected event handlers do depend on other event handlers in terms of the order of execution though. So what are the options of controlling the flow of execution if we really need to ensure that given steps are realised (events are processed) in a certain order?

Bundle event handlers together

Let’s say that after an order is created a message should be sent to the customer first, and another message should be sent to the shop manager only if sending the message to the customer succeeded. In such a case, you might want to introduce one handler in the “messaging” domain that subscribes to the “order created” event and ultimately results in emitting two new events: “new order notification sent to customer” and “new order notification sent to shop manager”. Sending the messages could be handled by a service object, a composite or the event handler itself. Such an event handler can be executed synchronously or asynchronously, as it does not make any difference in this case.
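A sketch of such a bundled handler is shown here. It is not the original listing; the `mailer` and `bus` collaborators, their method signatures and the event names are hypothetical, and the mailer is assumed to return `true` on success.

```ruby
module Messaging
  # One handler owns both notifications, so their order is explicit.
  class OrderCreatedHandler
    # `mailer` responds to deliver(to:, order_id:) and returns true on success;
    # `bus` responds to publish(event_name, payload). Both are hypothetical.
    def initialize(mailer:, bus:)
      @mailer = mailer
      @bus = bus
    end

    def call(order_id:, customer_email:, manager_email:)
      return unless @mailer.deliver(to: customer_email, order_id: order_id)
      @bus.publish("messaging.new_order_notification_sent_to_customer", { order_id: order_id })

      # Reached only when the customer message went out.
      return unless @mailer.deliver(to: manager_email, order_id: order_id)
      @bus.publish("messaging.new_order_notification_sent_to_shop_manager", { order_id: order_id })
    end
  end
end

# Stand-ins used only to exercise the handler.
class FakeMailer
  def initialize(failing: [])
    @failing = failing
  end

  def deliver(to:, order_id:)
    !@failing.include?(to)
  end
end

class RecordingBus
  attr_reader :published

  def initialize
    @published = []
  end

  def publish(event_name, payload)
    @published << [event_name, payload]
  end
end

bus = RecordingBus.new
handler = Messaging::OrderCreatedHandler.new(mailer: FakeMailer.new, bus: bus)
handler.call(order_id: 1, customer_email: "jane@example.com", manager_email: "boss@example.com")
puts bus.published.map(&:first).inspect
```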

Process events synchronously

With synchronous processing, handlers run in the order in which they were subscribed. While this approach is the easiest one to implement, it can be prone to introducing bugs, especially if not thoroughly tested. For instance, let’s assume that somebody sorted our subscriptions alphabetically: it does not seem to be a problem at the time, but it can cause subtle problems and inefficiencies later at runtime that might be hard to track down. A typical case is wanting to ensure that a customer is created in the CRM service only after we have confirmed that a message was sent to them successfully.
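The dependency on subscription order can be illustrated with a small synchronous bus. This is a sketch under stated assumptions, not the original listing: `SyncBus`, the `:customer_registered` event and both handlers are invented names.

```ruby
# Hypothetical synchronous bus: handlers run in subscription order.
class SyncBus
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_name, handler)
    @subscriptions[event_name] << handler
  end

  # Handlers run one by one; an exception stops the remaining ones.
  def publish(event_name, payload)
    @subscriptions[event_name].each { |handler| handler.call(payload) }
  end
end

log = []
send_welcome_message = ->(payload) { log << "message sent to #{payload[:email]}" }
create_crm_customer = ->(payload) { log << "CRM customer created for #{payload[:email]}" }

bus = SyncBus.new
# The order of these two lines is part of the behaviour;
# sorting them alphabetically would silently change it.
bus.subscribe(:customer_registered, send_welcome_message)
bus.subscribe(:customer_registered, create_crm_customer)
bus.publish(:customer_registered, { email: "jane@example.com" })
puts log
```

If the first handler raises, the CRM handler never runs, which is exactly the guarantee this approach relies on and also the reason it is fragile.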

Process events asynchronously

To ensure the correct order of processing we need to confirm that a given job was processed successfully, handle errors and retries, keep the queue from clogging, and so on. This leads to a significant amount of complexity, so the measure should be applied only if no other option is available.

Chain of events

In most cases using chained events is the best course of action. It might be challenging to make it work if triggering one event handler depends on more than one other event handler/event to be executed/propagated. There are different approaches to this problem based on retrying event handlers until some condition is met or aggregating events and emitting new ones as a result but many times it is good enough to just rethink the whole problem and organise the process in a different way.
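A chain of events can be sketched as follows: each handler finishes its own step and then publishes a new event, to which the next step subscribes. All class and event names here are hypothetical.

```ruby
# Hypothetical in-process bus, kept minimal for the example.
class Bus
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_name, handler)
    @subscriptions[event_name] << handler
  end

  def publish(event_name, payload)
    @subscriptions[event_name].each { |handler| handler.call(payload) }
  end
end

module Messaging
  class NotifyCustomer
    def initialize(bus:, log:)
      @bus = bus
      @log = log
    end

    def call(payload)
      @log << "customer #{payload[:customer_id]} notified"
      # Completing this step is itself an event the next step can react to.
      @bus.publish("messaging.customer_notified", payload)
    end
  end
end

module Crm
  class CreateCustomer
    def initialize(log:)
      @log = log
    end

    def call(payload)
      @log << "CRM record created for customer #{payload[:customer_id]}"
    end
  end
end

log = []
bus = Bus.new
bus.subscribe("ordering.order_created", Messaging::NotifyCustomer.new(bus: bus, log: log))
bus.subscribe("messaging.customer_notified", Crm::CreateCustomer.new(log: log))
bus.publish("ordering.order_created", { customer_id: 5 })
puts log
```

The order is now encoded in which event each handler listens to, so it no longer depends on how the subscriptions are sorted or on whether the bus is synchronous.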

Events bus

  1. Synchronous: in-process, handled fully synchronously. Can be effective when we prefer the order of running event handlers to be preserved and when event handlers are lightweight. Also great as a starting point when Pub/Sub is used as a means of refactoring large, legacy applications.
  2. Asynchronous: handled in a separate process/thread within the same environment. Non-blocking, usually based on some sort of queue (e.g. Faktory or Sidekiq). Can benefit from all features offered by the queue, like statistics, logging, retries, fine-grained error handling etc. Should be considered as the first choice.
  3. External: based on a separate service such as an external message broker and/or notification service (e.g. AWS SQS / AWS SNS, RabbitMQ). A step toward a solution working in a microservice-oriented architecture. Requires additional integration and may introduce a small performance overhead, so it should be introduced only when necessary. Still, it can be used as a secondary event bus within an application for handling some special cases / integrations.
  4. Logger: a simple medium for cases where it is the user who acts as a subscriber of events. Such event data can be stored in a file, redirected to STDOUT or even sent to an external service like CloudWatch for further investigation. In some emergencies, such persisted event stores can even be used to restore or revert the state of the system.
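To show the difference between the first two kinds of bus, here is a toy asynchronous bus built on Ruby's core `Queue` and `Thread`. It is only a sketch: `AsyncBus` is a hypothetical class, and a real application would normally rely on a job queue such as Sidekiq, which also provides retries and persistence.

```ruby
# Hypothetical asynchronous bus: publishing only enqueues,
# a single worker thread runs the handlers later.
class AsyncBus
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
    @queue = Queue.new
    @worker = Thread.new { process_queue }
  end

  def subscribe(event_name, handler)
    @subscriptions[event_name] << handler
  end

  # Returns immediately; handlers run on the worker thread.
  def publish(event_name, payload)
    @queue << [event_name, payload]
  end

  # Stop accepting events and wait until everything queued is handled.
  def shutdown
    @queue.close
    @worker.join
  end

  private

  def process_queue
    # Queue#pop returns nil once the queue is closed and empty.
    while (message = @queue.pop)
      event_name, payload = message
      @subscriptions.fetch(event_name, []).each { |handler| handler.call(payload) }
    end
  end
end

bus = AsyncBus.new
handled = []
bus.subscribe(:order_created, ->(payload) { handled << payload[:order_id] })
3.times { |i| bus.publish(:order_created, { order_id: i + 1 }) }
bus.shutdown
puts handled.inspect
```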

It is not uncommon for applications to take advantage of a few if not all kinds of event buses presented above, yet every decision to introduce a new one to the system should be backed by some solid reasoning. This is because choosing one or many specific media of propagating events may introduce numerous consequences.

For instance some media guarantee at-least-once-delivery so the concept of idempotency should be taken into account to handle events delivered more than once. Also, some (especially external) media can be temporarily unavailable, therefore planning for error handling to ensure eventual consistency seems to be critical.
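Idempotency can be achieved by remembering which events have already been handled. The sketch below assumes every event carries a unique `event_id` (an assumption, as is the `IdempotentHandler` class); the in-memory `Set` stands in for a durable store such as a database table.

```ruby
require "set"

# Hypothetical wrapper that makes any handler safe to call repeatedly.
class IdempotentHandler
  def initialize(&work)
    @processed_ids = Set.new # a durable store in a real application
    @work = work
  end

  def call(event_id:, payload:)
    return :skipped if @processed_ids.include?(event_id)

    @work.call(payload)
    # Recorded only after success, so a failed attempt can be retried.
    @processed_ids << event_id
    :processed
  end
end

charged = []
handler = IdempotentHandler.new { |payload| charged << payload[:order_id] }

puts handler.call(event_id: "evt-1", payload: { order_id: 10 })
puts handler.call(event_id: "evt-1", payload: { order_id: 10 }) # redelivery
puts charged.inspect
```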

Special uses of Pub/Sub

Refactoring monoliths

Existing code can be augmented with event emission capabilities, which should not affect the original logic at all, while the new behaviour, isolated and properly tested, can be handled within event handlers. Introducing Pub/Sub can also be used to decouple tightly bound parts of the system, especially if the coupling was introduced by the callbacks pattern.

Auditing/Logging

Costs and risks of introducing Pub/Sub

Requires mindset change

Correlations visibility

It is possible to register event handlers just where the event is emitted to further increase the visibility of what is going on, but personally, I am not a fan of this approach, as it further expands responsibilities of the class publishing the event. “The glue” is all over the place.

Inflexibilities in changing payload structure

Pub/Sub and Ruby on Rails

Implementing Pub/Sub using Wisper gem

The problem domains / scopes we can identify in this scenario are: reservations, digital locks and GRM. What we seem to be interested in is the moment when a new keycode is generated for a lock (possibly in the context of a reservation). The potential event would therefore be something like Locks::CodeGenerated, holding the actual keycode and the reservation identifier in its payload.

The code presented below is a sort of over-simplification and takes advantage of a few helpers and base classes that were not included. This was done on purpose to focus only on the actual business value of the solution.
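As a stand-in, the flow can be sketched in plain Ruby. This sketch deliberately avoids Wisper's API as well as any helpers or base classes, so `Bus`, `Locks::GenerateCode` and `Grm::CodeGeneratedHandler` are illustrative assumptions rather than the code the article refers to; only the `Locks::CodeGenerated` event and its payload come from the description above.

```ruby
# Hypothetical bus keyed by event class (a stand-in, not Wisper).
class Bus
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  def subscribe(event_class, handler)
    @subscriptions[event_class] << handler
  end

  def publish(event)
    @subscriptions[event.class].each { |handler| handler.call(event) }
  end
end

module Locks
  CodeGenerated = Struct.new(:keycode, :reservation_id, keyword_init: true)

  # Publisher: generates a keycode and announces it.
  class GenerateCode
    def initialize(bus)
      @bus = bus
    end

    def call(reservation_id:)
      keycode = format("%06d", rand(1_000_000)) # six random digits
      # ... programming the physical lock would happen here ...
      @bus.publish(CodeGenerated.new(keycode: keycode, reservation_id: reservation_id))
      keycode
    end
  end
end

module Grm
  # Subscriber: records what would be pushed to the GRM system.
  class CodeGeneratedHandler
    attr_reader :sent

    def initialize
      @sent = []
    end

    def call(event)
      @sent << { reservation_id: event.reservation_id, keycode: event.keycode }
    end
  end
end

bus = Bus.new
grm_handler = Grm::CodeGeneratedHandler.new
# "The glue": the subscription connects the two domains.
bus.subscribe(Locks::CodeGenerated, grm_handler)

Locks::GenerateCode.new(bus).call(reservation_id: 123)
puts grm_handler.sent.inspect
```

Neither domain references the other directly; only the subscription line knows about both.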

If you are interested in a high level, opinionated wrapper around the concept of publish/subscribe pattern in Ruby on Rails, I do recommend having a look at pubsub_on_rails. It is a robust library answering most of the every-day programmer needs when working with pub/sub and has pretty comprehensive documentation including examples of testing each part of the flow. What is more, it is battle tested on a project that has already processed nearly 40M events.

Other recommendations when using Pub/Sub in RoR

  1. Use “bang!” methods a lot and let stuff fail at runtime the correct way. This is especially useful for tracking down problems when payloads of events are not validated and we forget to provide some identifiers when emitting those events. Adhering to this rule pays off the most in the context of ActiveRecord finders.
  2. Use external system ids, if those are unique. As Pub/Sub excels at integrating many external systems, it is useful to use identifiers provided by those external systems, even as primary keys of objects we persist locally. Using them in event payloads as well will make it easier to audit events that were broadcast in the system.
  3. Consider maintaining a clear separation of domains. Regardless of the technique used, be it dependency injection, decorators or even CBRA / Packwerk, try to introduce interface segregation for models/objects that might share the same database table but are used in different domain contexts. Asking yourself “If I remove this whole domain from the application, will it still work?” can help you decide whether it needs more isolation.
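The first recommendation can be illustrated without Rails. In the sketch below, `CustomerRepository` is an in-memory stand-in whose `find_by!` only mimics the behaviour of the ActiveRecord finder of the same name (raising instead of returning `nil`); `RecordNotFound` and `SendWelcomeMessage` are likewise hypothetical. `Hash#fetch` plays the same role for the payload itself.

```ruby
class RecordNotFound < StandardError; end

# In-memory stand-in for a model; find_by! raises instead of returning nil.
class CustomerRepository
  def initialize(records)
    @records = records
  end

  def find_by(id:)
    @records[id]
  end

  def find_by!(id:)
    @records.fetch(id) { raise RecordNotFound, "Couldn't find customer with id=#{id.inspect}" }
  end
end

class SendWelcomeMessage
  def initialize(customers)
    @customers = customers
  end

  def call(payload)
    # fetch raises KeyError when the publisher forgot the identifier,
    # find_by! raises when the identifier points at nothing.
    customer = @customers.find_by!(id: payload.fetch(:customer_id))
    "Welcome, #{customer[:name]}!"
  end
end

customers = CustomerRepository.new(1 => { name: "Jane" })
handler = SendWelcomeMessage.new(customers)
puts handler.call({ customer_id: 1 })

begin
  handler.call({}) # identifier missing from the payload
rescue KeyError => error
  puts "failed loudly: #{error.message}"
end
```

With the non-bang variants the handler would instead fail later with a `NoMethodError` on `nil`, far from the actual cause.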

Summary

That being said, Pub/Sub can be an extremely useful tool for building flexible and decoupled architectures that are subject to scaling. It also facilitates planning development by making it easier to decompose large problems into smaller tasks, and it promotes a unified language for communication between the product owner and the dev team. For these and various other reasons I cannot recommend highly enough that you investigate Pub/Sub as a new element in your application domain context. Also, make sure to check out pubsub_on_rails, a great starting point for introducing Pub/Sub into your app.
