Distributed Elixir App in AWS, PT1 — Let’s build the application

Paweł Dawczak
5 min read · Jun 19, 2020


Let’s build a distributed application. A very simple one, but one that will show us some interesting building blocks and demystify some concepts.

This very simple application will automatically track the nodes in the cluster and dynamically display the list of nodes over WebSockets. It will be compiled locally using Docker and deployed to multiple EC2 instances in AWS.

This will be a series of posts, and this first part focuses on building the application itself.

This is not a comprehensive guide: it will not be applicable in all use cases, nor will it meet all security best practices. It is intended to show some interesting concepts and give you a starting point for your own development.

Interested? Let’s start!

Start a new project

It will be a very simple app and we will not need DB access, but we will use the new Phoenix generator to provide the initial code for LiveView, which we will use for real-time capabilities:

> mix phx.new distfun_simple --no-ecto --live

First, let’s implement a GenServer that will track all the nodes connected in the cluster. Let’s build the initial functionality that will allow us to start a new process:
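A minimal sketch of that GenServer might look like this (the DistfunSimple module prefix and the ClusterManager name are assumptions based on the project name and the rest of this post):

```elixir
defmodule DistfunSimple.ClusterManager do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, Keyword.put_new(opts, :name, __MODULE__))
  end

  @impl true
  def init(:ok) do
    # 1. Register this process to receive {:nodeup, node} and
    #    {:nodedown, node} messages when nodes join or leave the cluster.
    :ok = :net_kernel.monitor_nodes(true)

    {:ok, %{nodes: []}}
  end
end
```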

This is a fairly standard GenServer implementation. The important part is point 1: it registers the process to be notified when nodes join and leave the cluster. We will get back to what this provides a bit later…

Next, let’s add a very basic public interface that will return the list of all nodes in the cluster the process is aware of:
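A sketch of how this could look (the DistfunSimple prefix is assumed; get_all_nodes/0 is the function name used later in this post):

```elixir
# In DistfunSimple.ClusterManager

def get_all_nodes do
  GenServer.call(__MODULE__, :get_all_nodes)
end

@impl true
def handle_call(:get_all_nodes, _from, %{nodes: nodes} = state) do
  # The local node always goes first; the rest are the nodes we saw join.
  {:reply, [Node.self() | nodes], state}
end
```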

It will return the list of all nodes, including the node the application itself is running on, which it will make the very first entry in the list. This will be useful later, for demonstration purposes, when we hide all the apps behind a load balancer.

Next, let’s handle the information the process will receive when a new node joins the cluster (a message of the form {:nodeup, node}) or leaves it ({:nodedown, node}). Both will be handled by the handle_info callback:
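One way to sketch those two callbacks (continuing the assumed DistfunSimple.ClusterManager naming):

```elixir
# In DistfunSimple.ClusterManager

@impl true
def handle_info({:nodeup, node}, %{nodes: nodes} = state) do
  # Keep the list free of duplicates in case we see the same node twice.
  {:noreply, %{state | nodes: Enum.uniq([node | nodes])}}
end

def handle_info({:nodedown, node}, %{nodes: nodes} = state) do
  {:noreply, %{state | nodes: List.delete(nodes, node)}}
end
```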

It’s only a few lines of code, but it already starts to feel quite messy, so let’s take this opportunity to refactor it a bit. First, let’s introduce a submodule to store our state:
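A sketch of such a State submodule (the names are assumptions consistent with the rest of this post):

```elixir
defmodule DistfunSimple.ClusterManager.State do
  alias __MODULE__

  defstruct nodes: []

  def new, do: %State{}

  def add_node(%State{nodes: nodes} = state, node),
    do: %State{state | nodes: Enum.uniq([node | nodes])}

  def remove_node(%State{nodes: nodes} = state, node),
    do: %State{state | nodes: List.delete(nodes, node)}

  # The local node is always the very first entry.
  def all_nodes(%State{nodes: nodes}),
    do: [Node.self() | nodes]
end
```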

Next, let’s use the new functions in our GenServer callbacks:
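Rewritten against such State helpers, the callbacks might look like this (a sketch, assuming the naming used in this post):

```elixir
# In DistfunSimple.ClusterManager
alias DistfunSimple.ClusterManager.State

@impl true
def init(:ok) do
  :ok = :net_kernel.monitor_nodes(true)
  {:ok, State.new()}
end

@impl true
def handle_call(:get_all_nodes, _from, state) do
  {:reply, State.all_nodes(state), state}
end

@impl true
def handle_info({:nodeup, node}, state) do
  {:noreply, State.add_node(state, node)}
end

def handle_info({:nodedown, node}, state) do
  {:noreply, State.remove_node(state, node)}
end
```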

Great! Looks much cleaner!

Next, let’s make sure the ClusterManager will start alongside the app. In order to do so, we need to modify Application:
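The exact children list depends on the Phoenix version that generated the app; a sketch of lib/distfun_simple/application.ex with the ClusterManager added might look like:

```elixir
def start(_type, _args) do
  children = [
    DistfunSimpleWeb.Telemetry,
    {Phoenix.PubSub, name: DistfunSimple.PubSub},
    # Start the ClusterManager before the endpoint, so it is ready
    # by the time web requests arrive.
    DistfunSimple.ClusterManager,
    DistfunSimpleWeb.Endpoint
  ]

  opts = [strategy: :one_for_one, name: DistfunSimple.Supervisor]
  Supervisor.start_link(children, opts)
end
```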

We are nearly ready to test it in action, but there is one more change we need to make. At the moment, the Phoenix web server will start on port 4000, but we will want to start more nodes, so to avoid port conflicts let’s make the port configurable.

Let’s open configuration and change:
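One way to do it in config/dev.exs is to read a PORT environment variable at startup, defaulting to 4000:

```elixir
config :distfun_simple, DistfunSimpleWeb.Endpoint,
  http: [port: String.to_integer(System.get_env("PORT") || "4000")]
```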

And let’s see it in action. First, let’s start two instances of the app. In two separate terminals, run the following:

> PORT=4000 iex --sname a -S mix
> PORT=4001 iex --sname b -S mix

And now, from one of the instances, let’s try to connect to the other one. Here, I’ll execute the following from my b node:
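Assuming both nodes run on a machine whose hostname is my-host (a placeholder; yours will differ), connecting from b might look like this, with Node.connect/1 returning true on success:

```elixir
iex(b@my-host)1> Node.connect(:"a@my-host")
true
```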

Looks promising! Let’s test the following in the other:
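Back in node a (hostname my-host assumed), the sketched ClusterManager would answer with both nodes, its own node first:

```elixir
iex(a@my-host)1> DistfunSimple.ClusterManager.get_all_nodes()
[:"a@my-host", :"b@my-host"]
```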

Great! It works!

In the next step, let’s open up our ClusterManager so that other processes can listen for changes in the list of registered nodes.

Firstly, let’s add the changes to our State:
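One way to sketch those changes (storing listeners keyed by their monitor reference is a design choice of this sketch):

```elixir
# In DistfunSimple.ClusterManager.State
defstruct nodes: [], listeners: %{}

# Listeners are keyed by their monitor reference, so cleaning up after
# a listener goes down is a single Map.delete/2.
def add_listener(%State{listeners: listeners} = state, ref, pid),
  do: %State{state | listeners: Map.put(listeners, ref, pid)}

def remove_listener(%State{listeners: listeners} = state, ref),
  do: %State{state | listeners: Map.delete(listeners, ref)}
```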

Next, let’s update the interface to allow registering processes. Upon registering, we would like to give the caller a copy of the already stored nodes. We can do it like this:
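A sketch of such a subscribe function (the function name and the add_listener/3 helper are assumptions of this sketch):

```elixir
# In DistfunSimple.ClusterManager

def subscribe do
  GenServer.call(__MODULE__, {:subscribe, self()})
end

@impl true
def handle_call({:subscribe, pid}, _from, state) do
  # Monitor the subscriber, so the manager is told when it goes down.
  ref = Process.monitor(pid)

  # Reply with a copy of the nodes we already know about.
  {:reply, State.all_nodes(state), State.add_listener(state, ref, pid)}
end
```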

Our ClusterManager will store a list of all listeners interested in being notified about changes in the list of nodes in the cluster; but processes can finish their work, exit, or even crash, and then be unable to de-register themselves. This is why we set up a monitor: as soon as a listener goes down, our ClusterManager will receive a message indicating this fact.

When a process monitors another process and that process goes down, a message will be delivered to the monitoring process as a tuple of the following format:
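For a process monitor, that tuple has five elements:

```elixir
{:DOWN, ref, :process, pid, reason}
```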

In the case of a GenServer, this message will be handled by the handle_info callback. All we have to do is add a function that will handle the message, so let’s do it next:
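A sketch, matching on the monitor reference (remove_listener/2 is the helper name assumed in this post's sketches):

```elixir
# In DistfunSimple.ClusterManager

@impl true
def handle_info({:DOWN, ref, :process, _pid, _reason}, state) do
  {:noreply, State.remove_listener(state, ref)}
end
```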

Great! Now that we have the infrastructure in place, let’s add functionality to notify listeners.

First, let’s add a function that will broadcast the updated list of nodes to all listeners:
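A sketch of such a broadcast (the :nodes_updated tag and the notify_listeners/1 name are illustrative choices):

```elixir
# In DistfunSimple.ClusterManager

defp notify_listeners(%State{listeners: listeners} = state) do
  nodes = State.all_nodes(state)

  for {_ref, pid} <- listeners do
    send(pid, {:nodes_updated, nodes})
  end

  # Return the state unchanged, so this helper can sit inside a pipe.
  state
end
```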

It accepts the GenServer’s internal state and returns it, which allows us to compose it nicely with the rest of the code using the pipe operator. Let’s do that next: in the functions that are invoked every time a node joins or leaves the cluster, let’s change the code to look like the following:
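Piping the state through the notify helper (notify_listeners/1, an assumed name) keeps the callbacks tidy:

```elixir
# In DistfunSimple.ClusterManager

@impl true
def handle_info({:nodeup, node}, state) do
  {:noreply, state |> State.add_node(node) |> notify_listeners()}
end

def handle_info({:nodedown, node}, state) do
  {:noreply, state |> State.remove_node(node) |> notify_listeners()}
end
```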

With that in place, let’s give it a try and see how it works — shall we?

Let’s start two iex sessions again:

> PORT=4000 iex --sname a -S mix
> PORT=4001 iex --sname b -S mix

In shell a, let’s use this new function to get a list of nodes and subscribe:
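With the sketched subscribe/0 replying with the current list of nodes, the session (hostname my-host assumed) might look like:

```elixir
iex(a@my-host)1> DistfunSimple.ClusterManager.subscribe()
[:"a@my-host"]
```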

That’s correct: for now, we have two instances running separately, and the next thing to do is to connect them. Let’s do the following in session b:
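Again with the my-host placeholder for the hostname:

```elixir
iex(b@my-host)1> Node.connect(:"a@my-host")
true
```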

and back in session a: since it’s the iex session’s process that subscribed for updates, we can use flush() to see all the messages in its mailbox. Let’s try it next:
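Assuming the manager broadcasts tuples tagged :nodes_updated (an illustrative name used in this post’s sketches), the mailbox would contain:

```elixir
iex(a@my-host)2> flush()
{:nodes_updated, [:"a@my-host", :"b@my-host"]}
:ok
```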

Perfect! It did receive a new message with the updated list of nodes! Next, let’s kill session b, which is connected to the cluster, and see what happens. In session b, hit ctrl-c twice, and then, in session a:
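Under the same assumptions, the manager should have broadcast the shrunken list:

```elixir
iex(a@my-host)3> flush()
{:nodes_updated, [:"a@my-host"]}
:ok
```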

It works!

Going Web

Now, when we have lower-level components ready, let’s try to expose the information through a web interface.

The generator has already set up a LiveView route for us. Let’s change the URL where it is mounted as follows:
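For example, mounting it under /nodes (the path itself is an arbitrary choice) in lib/distfun_simple_web/router.ex:

```elixir
scope "/", DistfunSimpleWeb do
  pipe_through :browser

  live "/nodes", PageLive, :index
end
```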

Next, let’s update the PageLive. Let’s make mount to look like the following:
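A sketch of the updated LiveView, assuming subscribe/0 and get_all_nodes/0 both return the current list of nodes:

```elixir
defmodule DistfunSimpleWeb.PageLive do
  use DistfunSimpleWeb, :live_view

  alias DistfunSimple.ClusterManager

  @impl true
  def mount(_params, _session, socket) do
    nodes =
      if connected?(socket) do
        # Long-running WebSocket process: subscribe for updates.
        ClusterManager.subscribe()
      else
        # Static HTTP render: a one-off read is enough.
        ClusterManager.get_all_nodes()
      end

    {:ok, assign(socket, nodes: nodes)}
  end
end
```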

mount will be invoked twice. The first time, when the user visits the page, a standard HTTP call is performed. It returns a static web page, and for this purpose it is enough to obtain the list of currently registered nodes with get_all_nodes().

However, after this static page is loaded, JavaScript initiates the WebSocket connection, which invokes mount again, but this time in a new long-running process. The connected?/1 function helps to determine exactly that, and in this case we want the WebSocket process to subscribe to changes in the list of nodes.

Now that the WebSocket process is subscribed to the changes, every time a change occurs the process will be notified with a message, handled by the handle_info callback. Let’s implement it next:
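Under the assumption that updates arrive as {:nodes_updated, nodes} tuples (the tag name is illustrative), the callback is small:

```elixir
# In DistfunSimpleWeb.PageLive

@impl true
def handle_info({:nodes_updated, nodes}, socket) do
  {:noreply, assign(socket, nodes: nodes)}
end
```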

All it does is receive the new list of nodes and assign it to the socket. The last piece of work we need is a template for displaying this list. Let’s add it next:
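Assuming the template lives in lib/distfun_simple_web/live/page_live.html.leex (as the --live generator sets it up), it could be as simple as:

```eex
<h2>Nodes in the cluster</h2>
<ul>
  <%= for node <- @nodes do %>
    <li><%= node %></li>
  <% end %>
</ul>
```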

With the changes in place, let’s start two servers, visit them in separate browsers, navigate to the live-view page, and then, connect the nodes:

Demo of the LiveView page in action
