Monitor Crypto Events with Akka Stream

Taking the first steps from architecture to implementation is always thrilling. That is why, as developers, we tend to rush into coding: we want to make things real, see the cogs fit into one another and the whole beast move.

My research project is about looking for patterns that correlate announced crypto events (such as a conference, a fork or a product release) with fluctuations in the corresponding cryptocurrency’s parameters, such as price and volume.

The previous post covered the progress from an Event Storming session to a proper architecture that takes into account all the needs of the system, along with a detailed overview of the entire system. This post focuses on the implementation of the service that gathers data about crypto events and coin price fluctuations.

In terms of Domain-Driven Design (DDD), we have three aggregates here. Each aggregate may contain multiple entities and value objects.

  • The Crypto Event aggregate needs to monitor crypto events from the source and pass this information on to both the Coin aggregate and the Pub/Sub queue.
  • The Coin aggregate, once it knows what events are going on, needs to monitor the fluctuations in price of the related cryptocurrency and post this information on the queue as well. The “Let it Crash” design pattern teaches us that a failing service or component should be able to spring back to a healthy state, so this aggregate saves the list of events to monitor in a persistence layer, to be recovered on restart.
  • The Alert aggregate should be the one notifying the external world of all failures. It is common to multiple services, so it doesn’t appear in this post.

Coming out of the process of figuring out the aggregates, I tried something new and modularised my codebase around the aggregates themselves. What you see on the left is one package per aggregate (coin and cryptoevent) plus the service and config packages, which contain common logic. Within each aggregate package you find all the entities and value objects needed to accomplish the aggregate’s duty (see image below).

Every aggregate interacts with the outside world through an “aggregate root” object, a sort of single gateway that ensures internal consistency. I am a big fan of Actor programming: the asynchronous communication of actors embodies the concepts of isolation and decoupling, making them the perfect candidates for aggregate roots. Each actor establishes a communication protocol based on asynchronous messages, and that is the only way to get in touch with it.
Note that only an actor can exchange messages with another actor; once you introduce one, you will likely end up basing your entire interaction model on actors. Not a bad thing per se. Actors come in swarms.
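As a rough sketch of the idea, here is what an aggregate root as an actor could look like with Akka Typed. The names (`CoinRoot`, `MonitorEvent`) are hypothetical, not from the real codebase: the point is that the sealed `Command` trait is the actor’s entire communication protocol, and state changes only through messages.

```scala
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

// Hypothetical aggregate root for the Coin aggregate: the sealed Command
// trait is the whole protocol, and state lives only inside the behavior,
// so consistency is enforced at this single gateway.
object CoinRoot {
  sealed trait Command
  final case class MonitorEvent(eventId: String) extends Command
  final case class MonitoredEvents(replyTo: ActorRef[Set[String]]) extends Command

  def apply(): Behavior[Command] = running(Set.empty)

  private def running(events: Set[String]): Behavior[Command] =
    Behaviors.receiveMessage {
      case MonitorEvent(id) =>
        running(events + id)        // state changes only via messages
      case MonitoredEvents(replyTo) =>
        replyTo ! events
        Behaviors.same
    }
}
```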

Let’s have a closer look at the Crypto Event aggregate. Essentially, it needs to interact with an API at regular intervals, parse the results and forward them. We need to interact with two endpoints: one to receive an Auth token and one to retrieve the actual information about crypto events. I will have a Crypto Event actor taking care of this flow, and a Token actor acting as the guardian of the Auth token. The Coin aggregate needs to interact with an HTTP endpoint and forward its outcome. A Coin actor will take care of all this.
In the diagram below, the green square indicates that the actor is also the aggregate root component.

We are getting closer to the implementation level and this is the right moment to introduce a cornerstone of reactive systems: the streaming approach.

A Stream represents a potentially unbounded sequence of items in sequential order, whose rate of appearance may be fixed or vary over time.

Modeling real-world processes as streams is often the most effective way to nail down behaviors. A flow of birds, falling stars, cars driving through a tunnel, customers at the counter of a supermarket: all of these can be seen as streams of items. Once you enter this world, it is hard to leave it.

The Akka toolkit provides a very expressive streaming abstraction called Akka Streams. Processes can be modeled as sequences of Sources of items, Flows that process such items and Sinks that are their “final destination”.
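To make the three concepts concrete, here is a minimal sketch (assuming Akka 2.6, where an implicit `ActorSystem` provides the materializer): a `Source` emits items, a `Flow` transforms them, and a `Sink` collects them.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("streams-demo")

val numbers = Source(1 to 5)        // emits items
val double  = Flow[Int].map(_ * 2)  // processes them
val collect = Sink.seq[Int]         // their "final destination"

// Nothing runs until the blueprint is materialized.
val result = Await.result(numbers.via(double).runWith(collect), 3.seconds)
// result: Vector(2, 4, 6, 8, 10)
```

Note that `Source`, `Flow` and `Sink` are reusable blueprints: running the same graph twice materializes two independent streams.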

Furthermore, the Akka team maintains a community-driven project named Alpakka, which offers a set of connectors to many external services (and whose name is a clear reference to Camel). You can use Alpakka to combine streams flowing to and from FTP servers, Kafka clusters, S3 storage, HTTP or WebSocket endpoints, databases, queues, TCP sockets and many more. All of them take backpressure into account so as not to overwhelm your system. If you want a glimpse of Akka Streams’ power, the talk Colin Breck gave at Reactive Summit 2017 is the one that has inspired me the most so far.

If you remember my Event Storming session, I have an incoming stream of days. Every time a new day comes, the Crypto Event actor needs to poll the event HTTP endpoint and announce the result to both a Pub/Sub instance and the Coin actor. The flow of operations could look something like this
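In stream terms, that polling pipeline might be sketched roughly like this. Everything specific here is a stand-in: `fetchEvents` stubs the HTTP call, the interval is shortened from a day to 100 ms, and the two sinks stand in for the Pub/Sub queue and the Coin actor.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("poller")

// Stand-in for the real HTTP call to the crypto-event endpoint.
def fetchEvents(): Future[List[String]] =
  Future.successful(List("conference", "fork"))

// One tick per "day" (shortened to 100ms here); each tick triggers a poll,
// alsoTo fans the result out to a stand-in for the Pub/Sub queue, and the
// final sink stands in for the message to the Coin actor.
val polled = Source.tick(0.millis, 100.millis, ())
  .take(3)                                    // the real stream never stops
  .mapAsync(parallelism = 1)(_ => fetchEvents())
  .alsoTo(Sink.foreach(ev => println(s"to Pub/Sub: $ev")))
  .runWith(Sink.seq)                          // stands in for the Coin actor
```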

Simple, right? Except we know that lots of things can go wrong when the internet is involved. The red box you see above is actually a more complicated flow in itself: for instance, we need to refresh our Auth token if it has expired, or notify someone if things go wrong. The flow below is the red box expanded; it takes an Auth token as input (left) and outputs an HTTP response (right).

The HTTP filter on the right can detect whether our token was invalid. In that case, we refresh the token and try again with the new one. The Auth flow is another red box, as lots can go wrong when requesting a token, too.
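The refresh-and-retry step could be sketched as a `Flow` from token to response body. Both `callApi` and `refreshToken` are hypothetical stubs here (the real calls go through Akka HTTP); the shape of the logic, detect a 401, refresh once, retry, is what matters.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("auth-retry")
import system.dispatcher

// Hypothetical stand-ins: callApi returns Left(status) on failure,
// refreshToken obtains a fresh token from the Auth endpoint.
def callApi(token: String): Future[Either[Int, String]] =
  Future.successful(if (token == "fresh") Right("events-json") else Left(401))

def refreshToken(): Future[String] = Future.successful("fresh")

// A flow from Auth token (input) to HTTP response body (output):
// on a 401 we refresh the token once and retry, mirroring the expanded red box.
val requestWithRefresh: Flow[String, String, NotUsed] =
  Flow[String].mapAsync(parallelism = 1) { token =>
    callApi(token).flatMap {
      case Right(body) => Future.successful(body)
      case Left(401) =>
        refreshToken()
          .flatMap(callApi)
          .map(_.getOrElse(sys.error("still unauthorized after refresh")))
      case Left(code) => Future.failed(new RuntimeException(s"HTTP $code"))
    }
  }
```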

Time to bring all pieces together in a massive image. If you take the time to navigate through it, you will have a complete picture of where everything sits.

For a little Scala code, here are a couple of simplified excerpts from the Crypto Event flow. They are just meant to convey the idea: much is missing and much is hardcoded!

The custom event request flow is implemented using the GraphDSL syntax. As Stefano Bonetti recommends in a recent talk, the GraphDSL is a powerful weapon, to be used only when necessary, such as for logic with loops or fan-in and fan-out operations.
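To give a feel for the syntax, here is a toy fan-out/fan-in graph in the spirit of the event request flow (the branches are placeholder transformations, not the real logic): `Broadcast` sends each element down two branches and `Merge` joins them back into a single `Flow`.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.FlowShape
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("graph-dsl")

// A fan-out/fan-in shape: Broadcast duplicates each element into two
// branches, each branch transforms it, and Merge joins them back.
val fanOutFanIn: Flow[Int, Int, NotUsed] =
  Flow.fromGraph(GraphDSL.create() { implicit builder =>
    import GraphDSL.Implicits._
    val bcast = builder.add(Broadcast[Int](2))
    val merge = builder.add(Merge[Int](2))

    bcast.out(0) ~> Flow[Int].map(_ * 10) ~> merge.in(0)
    bcast.out(1) ~> Flow[Int].map(_ + 1)  ~> merge.in(1)

    FlowShape(bcast.in, merge.out)
  })

// Each input element comes out twice, once per branch (order not guaranteed).
val out = Await.result(Source(List(1, 2)).via(fanOutFanIn).runWith(Sink.seq), 3.seconds)
```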

This stream is meant to never stop. If it ever does, the actor that embeds it fails, which triggers its supervisor to restart it. This is a recommended practice that embraces the let-it-crash concept of restarting components to a fresh state.
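With Akka Typed, that supervision setup can be sketched like this. `StreamEmbedder` and its messages are hypothetical, and the stream itself is elided; what the sketch shows is the let-it-crash mechanics: the embedding actor throws when the stream dies, and the supervisor restarts it to a fresh state.

```scala
import akka.actor.typed.{ActorRef, Behavior, SupervisorStrategy}
import akka.actor.typed.scaladsl.Behaviors

// Hypothetical actor embedding the stream: a failure makes it throw,
// and the restart strategy brings it back with fresh state (count = 0).
object StreamEmbedder {
  sealed trait Command
  case object Processed                             extends Command
  case object StreamFailed                          extends Command
  final case class GetCount(replyTo: ActorRef[Int]) extends Command

  def apply(): Behavior[Command] =
    Behaviors
      .supervise(running(count = 0))
      .onFailure[RuntimeException](SupervisorStrategy.restart)

  private def running(count: Int): Behavior[Command] =
    Behaviors.receiveMessage {
      case Processed         => running(count + 1)
      case StreamFailed      => throw new RuntimeException("stream terminated")
      case GetCount(replyTo) => replyTo ! count; Behaviors.same
    }
}
```

After a `StreamFailed`, the supervisor reinstalls the initial behavior, so any progress counter is back to zero: exactly the fresh-state restart described above.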

We took a ride from an architectural overview down to a little implementation detail, and I hope all the passages are clear. The next step takes us to the cloud!