Testing in Elixir:: Chapter 4: Processes, processes everywhere
In this chapter, we are going to look at Processes. These are the integral building blocks that allow Elixir to go from "Oh cool, it's like a functional version of Ruby" to "Oh my god, it's literal witchcraft". In a good way, obviously.
It should also help raise awareness of some of the implementation details of Elixir and Erlang. So, let’s start with the design goals of Erlang itself.
Joe Armstrong, co-inventor of Erlang, stated that the language should be built around the following principles:
- Everything is a process.
- Processes are strongly isolated.
- Process creation and destruction is a lightweight operation.
- Message passing is the only way for processes to interact.
- Processes have unique names.
- If you know the name of a process you can send it a message.
- Processes share no resources.
- Error handling is non-local.
- Processes do what they are supposed to do, or fail.
As Elixir is built upon Erlang, we are able to take full advantage of these design principles for free.
Processes, and the concept of message passing, are not new, and neither are they unique to Elixir/Erlang. The roots of this design can be found throughout computer science, most notably in the Actor model and the underlying aspects of OOP. The success of Erlang’s design has led to similar implementations in other languages, for example Akka in Java/Scala, and Celluloid in Ruby.
Focusing on Elixir, everything runs inside of a process. We can demonstrate this by opening an empty iex session and doing the following:
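The original listing is not reproduced here, so this is only a minimal sketch of one way to see those processes. It assumes nothing beyond the standard library: `Process.list/0` returns the pid of every process on the node, and `:observer.start()` opens the graphical observer (which needs an Erlang/OTP build with wx support).

```elixir
# Every pid currently alive on this node, including the ones backing the shell.
pids = Process.list()
IO.puts("Processes running: #{length(pids)}")

# The current process (the iex shell when typed into iex) is one of them.
IO.inspect(self() in pids, label: "current process is in the list")

# For a graphical view, run `:observer.start()` in iex. It is left commented
# out here because it needs wx support and opens a window.
# :observer.start()
```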
Here, even with an empty iex session, we can see there are multiple processes running to make the REPL available to us.
These processes are incredibly cheap in terms of CPU and memory, so spawning up thousands of processes isn’t a big deal. We can see this by keeping the observer window open and using the following example from Programming Elixir:
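The listing from Programming Elixir is not reproduced here. As a rough stand-in, the sketch below builds a chain of processes in the same spirit: each process waits for a number, adds one, and passes it along. The module name `ChainSketch` and its function names are made up for this illustration. It uses 10,000 processes; going up to a million may require raising the VM's process limit, for example with `elixir --erl "+P 2000000"`.

```elixir
defmodule ChainSketch do
  # Each link waits for one number, passes it on incremented, then exits.
  def link(next_pid) do
    receive do
      n when is_integer(n) -> send(next_pid, n + 1)
    end
  end

  # Builds a chain of `n` processes ending at the caller, then sends 0 in.
  def run(n) do
    first =
      Enum.reduce(1..n, self(), fn _, next_pid ->
        spawn(ChainSketch, :link, [next_pid])
      end)

    send(first, 0)

    receive do
      total when is_integer(total) -> total
    end
  end
end

# :timer.tc/1 returns {microseconds, result}.
{micros, total} = :timer.tc(fn -> ChainSketch.run(10_000) end)
IO.puts("#{total} hops in #{div(micros, 1000)} ms")
```

Because every link increments the number exactly once, the total that comes back equals the number of processes in the chain.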
If we observe the Load charts when running the example we will see something like this:
So, even when spawning up a million processes, we utilize less than 50% of our scheduler — and the memory usage has a lovely symmetrical peak and trough. Pretty good.
If you didn’t have the observer open, the only thing you would notice is the roughly 5 seconds it takes to complete the request.
Let’s move on to some simpler examples to understand what is going on, and how to test them!
Here we have a very simple process module. It calls spawn/3 to create a process that will execute the function given as an atom (in this case :receiver) from the module supplied (in this case __MODULE__, which is shorthand for Chapter4.BasicProcess), passing in the supplied arguments (here, an empty List). The receiver/0 function declares the types of messages it can handle. This follows the underlying design of message passing:
If the object responds to the message, it has a method for that message.
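A minimal sketch of what such a module could look like is below. The module name, `:receiver` and the empty argument list come from the text; the `start/0` function name and the `{:hello, name}` message shape are assumptions made for this illustration.

```elixir
defmodule Chapter4.BasicProcess do
  # Spawns a process running receiver/0 from this module with no arguments.
  # (`start/0` is a hypothetical name.)
  def start do
    spawn(__MODULE__, :receiver, [])
  end

  # Handles exactly one message, then returns, which ends the process.
  # It must be public so that spawn/3 can call it by name.
  def receiver do
    receive do
      # Assumed message shape; anything else stays unread in the mailbox.
      {:hello, name} -> IO.puts("Hello, #{name}!")
    end
  end
end

pid = Chapter4.BasicProcess.start()
IO.inspect(is_pid(pid), label: "spawned a pid")
```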
If we recall the designs of Erlang, this new Process has one job:
Processes do what they are supposed to do, or fail.
Rick and Morty fans can think of Processes in terms of Mr Meeseeks, except the length of time it takes to do their job doesn’t result in pain… This means asking it to take two strokes off your game is possible!
Running this code will give us the following:
This process, affectionately known as #PID<0.126.0>, will just sit there and wait patiently until it receives a message. We can check it is waiting there by asking if it is alive:
We can also find out more information about our Process, so let’s do that:
This is the internal representation of a Process. The important parts for this article are that it is currently waiting, and that the process mailbox (messages: in the picture) is empty.
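For readers without the screenshots, here is a small sketch of the same checks. It uses an anonymous function as a stand-in for the chapter's process, and asks for single items with `Process.info/2` rather than printing the whole list.

```elixir
# A stand-in process: it blocks in receive until it is told to stop.
pid =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

# Give the new process a moment to reach its receive block.
Process.sleep(50)

IO.inspect(Process.alive?(pid), label: "alive?")

# Process.info/2 fetches one item; Process.info/1 returns a keyword list.
IO.inspect(Process.info(pid, :status), label: "status")
IO.inspect(Process.info(pid, :messages), label: "mailbox")
```

Once the process is parked in `receive`, the status should read `:waiting` and the mailbox should be an empty list.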
So, as it is waiting, let us send it a message:
Yay, it responded.
Now let’s check on its health:
Annnnnd it’s gone :(
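The whole life cycle can be sketched in a plain script too. This version is an assumption-laden stand-in: the process reports back to its parent instead of printing, and a monitor is used to wait for the exit, since a process does not disappear at the exact instant it handles its message.

```elixir
parent = self()

pid =
  spawn(fn ->
    receive do
      {:hello, name} -> send(parent, {:handled, name})
    end
  end)

send(pid, {:hello, "world"})

# Wait for confirmation that the message was handled.
receive do
  {:handled, name} -> IO.puts("Handled greeting for #{name}")
after
  1_000 -> IO.puts("No reply within a second")
end

# Exiting is asynchronous, so wait for the :DOWN message from a monitor
# instead of guessing how long to sleep.
ref = Process.monitor(pid)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
after
  1_000 -> :timeout
end

IO.inspect(Process.alive?(pid), label: "alive?")
```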
Let’s look at testing this simple scenario of spawning a process and sending it a message:
In the first test we transform the output of Process.info/1 into a Map (line 9). This allows us to pattern match on the important parts. This is useful because PIDs are dynamic, so hardcoding an exact match against everything inside the List simply wouldn’t work.
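A sketch of that first kind of test is below; it is not the original test. It matches on `:status` and `:message_queue_len`, because the exact set of keys returned by `Process.info/1` can differ between OTP releases. The suite is run explicitly so the snippet works as a standalone script; in a Mix project, `test_helper.exs` would call `ExUnit.start/0` instead.

```elixir
ExUnit.start(autorun: false)

defmodule ProcessInfoSketchTest do
  use ExUnit.Case

  test "a freshly spawned process waits with an empty mailbox" do
    pid =
      spawn(fn ->
        receive do
          :stop -> :ok
        end
      end)

    # Give the process a moment to reach its receive block.
    Process.sleep(50)

    # Keyword list -> Map, so we can match only on the keys we care about.
    info = pid |> Process.info() |> Map.new()

    assert %{status: :waiting, message_queue_len: 0} = info
  end
end

result = ExUnit.run()
```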
In the second test we are sneaky and take advantage of :erlang.trace. This will allow us to reach into the underlying Erlang and keep track of the messages our spawned process receives.
We then send a message to our process and use ExUnit’s assert_receive to verify that our process did indeed receive the message in the format that we sent it.
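As a rough sketch of the tracing approach (again, not the original test): `:erlang.trace/3` with the `:receive` flag makes the VM send a `{:trace, pid, :receive, message}` tuple to the calling process every time the traced process receives something. The `{:hello, name}` message shape is an assumption.

```elixir
ExUnit.start(autorun: false)

defmodule TraceSketchTest do
  use ExUnit.Case

  test "the spawned process receives the message we sent" do
    pid =
      spawn(fn ->
        receive do
          {:hello, _name} -> :ok
        end
      end)

    # Ask the VM to report every message `pid` receives to this test process.
    :erlang.trace(pid, true, [:receive])

    send(pid, {:hello, "world"})

    # Trace events arrive as {:trace, pid, :receive, message}.
    assert_receive {:trace, ^pid, :receive, {:hello, "world"}}
  end
end

result = ExUnit.run()
```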
You can see there are a lot of small moving parts here, and hopefully you have a sense that this is probably at a level lower than we actually care about for our application…
Currently, our process is dying after one job. This isn’t ideal, so we should address that by giving our process a tiny bit of “persistence”.
However, before that, there is a small but interesting edge case that we should look into:
Creating or destroying a Process carries a small overhead. This means that if we are quick enough, we can send a Process multiple messages before it exits. However, if you recall, a Process only has one job. All of the messages land in the process mailbox, but only the first one received is processed. The rest stay in the mailbox, unread, and are discarded along with the process when it exits.
I certainly wasn’t expecting this and thought it was kinda neat. Useless, but kinda neat. You can add it to your pub quiz knowledge for the next time there is a round on Elixir process quirks.
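Relying on being "quick enough" makes this hard to reproduce, so the sketch below cheats a little to make it deterministic: the process first waits for a `:go` signal (an addition made purely for this demonstration), which guarantees the three greetings are already queued. It then does its one job and reports how many messages are left unread.

```elixir
parent = self()

pid =
  spawn(fn ->
    # Hold off until :go arrives so all three greetings are already queued.
    receive do
      :go -> :ok
    end

    # The one job: handle a single message.
    handled =
      receive do
        {:greet, name} -> name
      end

    {:message_queue_len, unread} = Process.info(self(), :message_queue_len)
    send(parent, {:report, handled, unread})
  end)

for name <- ["one", "two", "three"], do: send(pid, {:greet, name})
send(pid, :go)

report =
  receive do
    {:report, handled, unread} -> {handled, unread}
  after
    1_000 -> :timeout
  end

IO.inspect(report)
# {"one", 2}
```

Only the first greeting is handled; the other two are still sitting in the mailbox when the process finishes.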
It is possible to transform our Process from being able to only handle one message, to handling multiple messages. All it takes is one line of code:
This means that after handling a message, the function will call itself again and wait for another message.
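Here is a minimal sketch of that change, using a hypothetical module name and the same assumed `{:hello, name}` message shape as before. The only difference from a one-shot receiver is the final call to `receiver()`.

```elixir
defmodule LoopingProcessSketch do
  def start, do: spawn(__MODULE__, :receiver, [])

  def receiver do
    receive do
      {:hello, name} -> IO.puts("Hello, #{name}!")
    end

    # The one extra line: wait for the next message instead of returning.
    receiver()
  end
end

pid = LoopingProcessSketch.start()
send(pid, {:hello, "first"})
send(pid, {:hello, "second"})

# Give the process time to print both greetings before checking on it.
Process.sleep(50)
IO.inspect(Process.alive?(pid), label: "still alive?")
```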
How this works is really subtle, but incredibly cool and worthy of a knowing golf clap. So, here goes!
When you spawn a process, it creates a unique PID. When a process dies and you create another one, you will be assigned another unique PID:
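A tiny sketch of that behaviour, using two throwaway processes:

```elixir
# Two short-lived processes, created one after the other.
first = spawn(fn -> :ok end)
second = spawn(fn -> :ok end)

IO.inspect(first, label: "first")
IO.inspect(second, label: "second")
IO.inspect(first == second, label: "same pid?")
```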
This isn’t particularly useful for a process that should stick around. By calling the function again as the very last step of the function, we make a recursive call. Typically, this would mean adding a new stack frame to the call stack. However, Elixir and Erlang belong to a group of languages that take advantage of Tail-Call Optimization. This means the recursive call reuses the current stack frame: rather than allocating a new PID or creating a new instance of the function, it behaves like a GOTO command and just jumps back up to the top of the function. This is essentially the closest Elixir will ever get to a loop.
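Tail calls are not specific to processes. The sketch below (module and function names are made up for this example) counts down a million times with a plain recursive function; because the recursive call is the last thing the function does, the stack does not grow with each step.

```elixir
defmodule TailCallSketch do
  # The recursive call is in tail position, so the VM reuses the current
  # stack frame instead of pushing a new one for every step.
  def count_down(0, steps), do: steps
  def count_down(n, steps) when n > 0, do: count_down(n - 1, steps + 1)
end

IO.inspect(TailCallSketch.count_down(1_000_000, 0))
# 1000000
```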
With this in mind, let’s see this in practice — firstly in the command line, then in tests:
Note here that throughout we have kept the same PID.
This is similar to the steps taken in the command line.
Side note on sending multiple messages: similar to the overhead in the BasicTest, there is a slight overhead on receiving messages. Line 36 contains a sleep of 1ms; this is necessary for the tests to pass consistently.
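One possible alternative to sleeping, sketched below and not taken from the original test suite, is to lean on tracing again. `assert_receive` waits (100 ms by default) for each trace event, so the test does not need to guess how long delivery takes. The looping receiver is defined inline and is hypothetical.

```elixir
ExUnit.start(autorun: false)

defmodule LoopingTraceSketchTest do
  use ExUnit.Case

  # Hypothetical looping receiver, defined inline to keep the sketch standalone.
  defp loop do
    receive do
      {:hello, _name} -> loop()
    end
  end

  test "a looping process handles several messages and stays alive" do
    pid = spawn(fn -> loop() end)
    :erlang.trace(pid, true, [:receive])

    send(pid, {:hello, "first"})
    send(pid, {:hello, "second"})

    # Each assert_receive blocks until the matching trace event shows up.
    assert_receive {:trace, ^pid, :receive, {:hello, "first"}}
    assert_receive {:trace, ^pid, :receive, {:hello, "second"}}
    assert Process.alive?(pid)
  end
end

result = ExUnit.run()
```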
So far our messages have been one way — they don’t need to be. We can send messages back to whoever/whomever has sent us one, typically as a saner way of confirming that a message has made it to its destination.
This is done with only a slight tweak to our
Here we add another argument to our message tuple called ‘caller’. This is where we will pass in the PID of the process that is sending the message so that it knows to respond. We can see it in practice in the command line:
Here we can see that we pass in the PID of the iex terminal as the author of the message, and we can see that we get a response back from the ResponderProcess. We have to use flush/0 to empty out the process mailbox. This can be useful when there isn’t an explicit way to handle messages received, but again… not a great thing to do in a large application.
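A minimal sketch of a responder along these lines is below. The module name, the position of `caller` in the tuple and the `{:ok, greeting}` reply are all assumptions. Since `flush/0` is an iex helper, the script reads the reply with an explicit `receive` instead.

```elixir
defmodule ResponderProcessSketch do
  def start, do: spawn(__MODULE__, :receiver, [])

  def receiver do
    receive do
      # `caller` is the pid to reply to.
      {:hello, name, caller} -> send(caller, {:ok, "Hello, #{name}!"})
    end

    receiver()
  end
end

pid = ResponderProcessSketch.start()
send(pid, {:hello, "world", self()})

# In iex, flush/0 would print the reply; a plain script has to receive it.
reply =
  receive do
    {:ok, greeting} -> greeting
  after
    1_000 -> :timeout
  end

IO.inspect(reply)
# "Hello, world!"
```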
A way of testing this flow is:
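The original test is not reproduced here; the following is a sketch of the idea under the same assumed message shapes. Inside a test, `self()` is the test process, so the reply lands in its mailbox and `assert_receive` can pick it up.

```elixir
ExUnit.start(autorun: false)

defmodule ResponderSketchTest do
  use ExUnit.Case

  test "the process replies to the caller" do
    pid =
      spawn(fn ->
        receive do
          {:hello, name, caller} -> send(caller, {:ok, "Hello, #{name}!"})
        end
      end)

    # self() is the test process, so the reply comes back to this test.
    send(pid, {:hello, "world", self()})

    assert_receive {:ok, "Hello, world!"}
  end
end

result = ExUnit.run()
```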
Very similar to the command line example, but we don’t need to call
There is a lot more to do with Processes and the different variations of them that exist. At this point we should have a basic understanding of the following:
- Everything in Elixir is a process
- We can freely spawn processes
- Processes have one job
- Spawning processes is cheap
- We can pass messages from process to process just by knowing their PID
- How to test basic message passing
It is also key to have this understanding, as the fun OTP based things we get from Elixir are built on these concepts. GenServers, for example, are an insanely powerful tool based on passing messages around an application.
Throughout this chapter you will have noticed a few awkward moments when testing processes, such as handling multiple messages and relying on timing tricks. In fact, if you are running the test suite locally, there are moments where the tests will fail solely due to the timing / overheads of processes.
There is also a larger consideration here: we are bordering on testing the actual implementation of the language itself. This is rarely a good idea. We have other ways of testing whether messages are being passed around successfully. The three previous chapters demonstrated this, and we will continue doing it in other chapters.
Jose referred to this in this forum post. Coincidentally, this is where I got the trace idea from as well.
I hope you enjoy this Chapter. If there are any questions or parts that should get more attention do leave a response, otherwise we will move on to the next topic.