Subscribe to messages with pattern matching
Lasse Ebert

TL;DR: I like your pattern-match implementation. Below are some ideas that might help you, plus a shameless plug for Flub, a similar Pub/Sub that is in real-world production use.

I really like your use of Macro.escape/2 for building a portable pattern, and wish I’d found it a couple of years ago when I built my (very similar) solution to the same problem. Having run that solution in production for about two years (Flub, and its unpublished predecessor), I have learned some lessons that may help you build your Hub module into a production-ready application (note: Channel == Topic, and Dispatcher == Hub):

  1. Metadata. Letting a subscriber know which channel the inbound message was published on is worth the pain of a larger handle_info/2 or receive pattern. Wrapping the delivered publication in a map/struct that minimally identifies the channel (it could also include a subscription ref, if you wanted to be fancy) makes life better for the subscriber.
  2. Overworked Channel Hubs. I once made the mistake of having N subscribers to a channel, each subscribed to 4 different pattern matches; each time that dispatcher published a message (10–11/second), it had to compute 4N matches. When N started to scale, the hub started to back up. The solution for me was a finer-grained Channel structure, so that each Topic was more focused; this led to more hubs and spread the computational load across more GenServers. The system was under the same load, but the Erlang VM scheduler could take advantage of multiple cores and improve overall throughput.
  3. Separating the Macro. After using more channels, I found that I was pattern matching less, so I moved the pattern match into its own macro, and passed that as an option to the subscription. This naturally gave rise to an easy syntax for other options, e.g. cross-node subscriptions. As a side benefit, only subscribers who need the macro have to use an import statement.
  4. Pipeline design. Elixirists love their |>, and by putting the published term second, you’ve made it harder to pipeline (I did the same thing with my API, at first). By swapping the order, you gain the ability to pipeline the results of an operation into a publish. Also, returning the published term allows you to continue operating on the term, a la IO.inspect.
  5. Subscriber Focus. This was a design choice I made early on that benefited performance greatly. In Flub, calling pub/2 is essentially a free operation for the system if there are no subscribers: the Dispatcher is started in response to sub/2, and then I use :gproc to do the heavy lifting of deciding whether a pub can be a no-op. One downside to this design decision is that I cannot pattern-match on Channels, but only on published terms. I’m still trying to come up with a solution to that limitation.
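
To make points 1 and 4 concrete, here is a rough sketch of what I mean. It is not Flub's or Hub's actual code: SketchHub, SketchHub.Subscribers, and the :channel/:data keys are names made up for this example, and Elixir's built-in Registry stands in for the dispatcher (Flub itself uses :gproc, as noted in point 5).

```elixir
defmodule SketchHub do
  # Hypothetical module for illustration only; not Flub's or Hub's real API.
  # A duplicate-key Registry from the standard library plays the dispatcher.
  @registry SketchHub.Subscribers

  def start do
    Registry.start_link(keys: :duplicate, name: @registry)
  end

  # Subscribe the calling process to a channel.
  def sub(channel) do
    {:ok, _owner} = Registry.register(@registry, channel, nil)
    :ok
  end

  # Point 4: the term comes first so it can be piped in, and it is returned
  # unchanged so the pipeline can keep going (as IO.inspect/2 does).
  def pub(term, channel) do
    # The callback is not invoked when nobody is registered under the channel.
    Registry.dispatch(@registry, channel, fn subscribers ->
      for {pid, _value} <- subscribers do
        # Point 1: wrap the payload in a map that names the channel.
        send(pid, %{channel: channel, data: term})
      end
    end)

    term
  end
end

{:ok, _pid} = SketchHub.start()
:ok = SketchHub.sub(:temperature)

# Publish in the middle of a pipeline, then keep working on the same term.
result =
  %{celsius: 21.5}
  |> SketchHub.pub(:temperature)
  |> Map.put(:logged, true)

# The subscriber can match on the channel without inspecting the payload.
delivered =
  receive do
    %{channel: :temperature} = msg -> msg
  after
    1_000 -> :timeout
  end

IO.inspect({result, delivered})
```

The envelope costs one extra map per delivery, but the subscriber can route on the channel in the receive or handle_info/2 head. Returning the term from pub/2 is what lets the Map.put/3 call follow it in the pipeline.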

I hope some of these thoughts are useful to your continued development. If you are interested, check out Flub, which has been in sustained production in 3 unpublished applications for the past two years.

Like what you read? Give Chris Meyer a round of applause.
