How Akka Actors work
Akka Actors is a popular, widely used framework that implements Actors (or, for simplicity, lightweight threads) on the JVM. To understand how Actors work and how they are mapped to JVM threads, we created a tiny Actor framework of our own, similar to Akka Actors. The focus is on learning and understanding Actor scheduling, so a lot of other functionality, including the supervisor implementation, is omitted.
Example code is available at https://github.com/unmeshjoshi/actorscheduling
The following are the main concepts of any actor framework.
Actors
Actors provide a higher level of abstraction for writing concurrent and distributed programs. Roughly speaking, an Actor is a unit of concurrently manageable work and its associated state. An Actor’s state is strictly private to that Actor. The only way to communicate with an Actor is by sending messages to it.
MailBoxes
Actors have mailboxes attached to them. A mailbox is simply a queue that can be accessed concurrently. Messages sent to an actor are queued in its mailbox.
Messages in a mailbox are dispatched to an Actor one at a time in order.
Dispatchers
Dispatchers are the execution engine for Actors. A Dispatcher internally manages a thread pool on which Actors are scheduled to execute their message handling code.
To go deeper into how this works, let’s start with some code.
To implement an Actor, we need a function that processes messages.
An Actor can then be implemented on top of that function, together with whatever private state it needs.
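A minimal Java sketch of these two pieces. The names `Actor`, `receive` and `CounterActor` are assumptions made for illustration; they are not Akka's API and not necessarily what the linked repository uses.

```java
// Hypothetical contract: an actor is anything that can handle one message at a time.
interface Actor {
    void receive(Object message);
}

// Example actor. The counter is private state that is only modified inside receive().
class CounterActor implements Actor {
    private int count = 0;

    @Override
    public void receive(Object message) {
        // Unknown messages are simply ignored in this sketch.
        if ("increment".equals(message)) {
            count++;
        }
    }

    // Accessor kept only so the sketch can be checked; a real actor would
    // expose its state by replying to a message instead.
    int current() {
        return count;
    }
}
```

Nothing here is thread safe on its own. The framework has to guarantee that `receive` is never called concurrently for the same actor, which is what the mailbox and dispatcher are for.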
To execute this Actor, we need a way to attach a mailbox to it and associate it with a dispatcher. This can be done by creating a wrapper around the Actor, called ActorCell.
An ActorCell can be a generic object that accepts the Class of the Actor implementation and keeps a reference to its receive method.
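One way such a wrapper might look in Java. This is a sketch under assumptions: `enqueue`, `processNext` and `RecordingActor` are hypothetical names, the actor class is assumed to have a no-argument constructor, and the dispatcher reference is left out to keep the example short.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

interface Actor {
    void receive(Object message);
}

// Hypothetical wrapper: owns the actor instance and the mailbox attached to it.
class ActorCell {
    private final Consumer<Object> receive; // reference to the actor's receive method
    private final Queue<Object> mailbox = new ConcurrentLinkedQueue<>();

    ActorCell(Class<? extends Actor> actorClass) {
        Actor actor;
        try {
            // Assumption: the actor class has an accessible no-argument constructor.
            actor = actorClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + actorClass, e);
        }
        this.receive = actor::receive;
    }

    void enqueue(Object message) {
        mailbox.offer(message);
    }

    // Hands the oldest queued message to the actor; returns false if the mailbox was empty.
    boolean processNext() {
        Object message = mailbox.poll();
        if (message == null) {
            return false;
        }
        receive.accept(message);
        return true;
    }
}

// Example actor. The list is static only so the sketch can observe what was received.
class RecordingActor implements Actor {
    static final List<Object> SEEN = new CopyOnWriteArrayList<>();

    @Override
    public void receive(Object message) {
        SEEN.add(message);
    }
}
```

The Actor instance never leaves the cell, so the only way to reach it is through the mailbox.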
It’s important to note that we cannot expose this internal implementation to clients of the Actor. How do we achieve that?
As discussed above, the only way to communicate with an Actor is to send it a message.
We can define that contract as a small interface, ActorRef.
An ActorRef implementation then forwards each message to the ActorCell it wraps.
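A sketch of that contract and one possible implementation in Java. The method names `tell` and `sendMessage` and the class name `LocalActorRef` are assumptions, and the `ActorCell` here is only a stand-in that records what it is sent.

```java
import java.util.ArrayList;
import java.util.List;

// The only operation a client gets to see.
interface ActorRef {
    void tell(Object message);
}

// Stand-in for the internal cell; it just records messages in this sketch.
class ActorCell {
    final List<Object> received = new ArrayList<>();

    void sendMessage(Object message) {
        received.add(message);
    }
}

// Hypothetical implementation: hides the cell and forwards every message to it.
class LocalActorRef implements ActorRef {
    private final ActorCell cell;

    LocalActorRef(ActorCell cell) {
        this.cell = cell;
    }

    @Override
    public void tell(Object message) {
        cell.sendMessage(message);
    }
}
```

Because client code is written against `ActorRef`, the cell, mailbox and dispatcher can change without affecting callers.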
How do clients get this ActorRef?
We can do this by having a factory method that creates an actor and returns its ActorRef.
An ActorSystem is the factory which wires together ActorRefs, Dispatchers and Mailboxes.
Clients can now create Actor instances through the ActorSystem and only ever hold an ActorRef.
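A simplified sketch of such a factory and its client-side use in Java. The names `actorOf`, `tell`, `shutdown`, `GreetingActor` and `ClientDemo` are assumptions. To keep the block short, a single-threaded executor stands in for the mailbox and dispatcher; it queues messages and runs them one at a time, which is not how the real scheduling works.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface Actor {
    void receive(Object message);
}

interface ActorRef {
    void tell(Object message);
}

// Hypothetical factory that wires an actor to its execution machinery.
class ActorSystem {
    // Stand-in for mailbox + dispatcher: one thread, FIFO task queue.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    ActorRef actorOf(Class<? extends Actor> actorClass) {
        final Actor actor;
        try {
            // Assumption: the actor class has an accessible no-argument constructor.
            actor = actorClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + actorClass, e);
        }
        // The client receives only the ActorRef, never the Actor instance.
        return message -> executor.execute(() -> actor.receive(message));
    }

    // Stops accepting messages and waits briefly for queued ones to be processed.
    void shutdown() {
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Example actor. The list is static only so the demo can observe what was received.
class GreetingActor implements Actor {
    static final List<Object> RECEIVED = new CopyOnWriteArrayList<>();

    @Override
    public void receive(Object message) {
        RECEIVED.add(message);
    }
}

class ClientDemo {
    public static void main(String[] args) {
        ActorSystem system = new ActorSystem();
        ActorRef greeter = system.actorOf(GreetingActor.class);
        greeter.tell("hello");
        greeter.tell("world");
        system.shutdown();
        System.out.println(GreetingActor.RECEIVED);
    }
}
```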
MailBoxes and Dispatchers come into play when a message is sent to an Actor. As discussed above, the ActorRef is a wrapper around the ActorCell, which wires the MailBox and the Actor instance together.
When a client sends a message to an ActorRef, it is routed to the ActorCell.
The ActorCell maintains references to the MailBox and the Dispatcher. The message is forwarded to the Dispatcher immediately.
The Dispatcher then enqueues the message and schedules the mailbox for execution if it is not already scheduled.
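The send path described above can be sketched in Java as follows. All method names (`sendMessage`, `dispatch`, `registerForExecution`, `trySchedule`, `setIdle`) are assumptions for this sketch, and a plain `Executor` stands in for the thread pool. An `AtomicBoolean` is one way to ensure that a mailbox is scheduled at most once at a time.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical mailbox: a message queue plus a "scheduled" flag.
class Mailbox {
    private final Queue<Object> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Consumer<Object> receive;

    Mailbox(Consumer<Object> receive) { this.receive = receive; }

    void enqueue(Object message) { queue.offer(message); }
    boolean hasMessages() { return !queue.isEmpty(); }

    // Only one caller can win the transition from idle to scheduled.
    boolean trySchedule() { return scheduled.compareAndSet(false, true); }
    void setIdle() { scheduled.set(false); }

    void processMessages() {
        Object message;
        while ((message = queue.poll()) != null) {
            receive.accept(message);
        }
    }
}

class Dispatcher {
    private final Executor executor;

    Dispatcher(Executor executor) { this.executor = executor; }

    void dispatch(Mailbox mailbox, Object message) {
        mailbox.enqueue(message);
        registerForExecution(mailbox);
    }

    void registerForExecution(Mailbox mailbox) {
        // Submit the mailbox only if it has work and is not already scheduled.
        if (mailbox.hasMessages() && mailbox.trySchedule()) {
            executor.execute(() -> {
                try {
                    mailbox.processMessages();
                } finally {
                    mailbox.setIdle();
                    // Pick up messages that arrived just before the flag was cleared.
                    registerForExecution(mailbox);
                }
            });
        }
    }
}

class ActorCell {
    private final Mailbox mailbox;
    private final Dispatcher dispatcher;

    ActorCell(Mailbox mailbox, Dispatcher dispatcher) {
        this.mailbox = mailbox;
        this.dispatcher = dispatcher;
    }

    // Forward to the dispatcher immediately; no work happens on the sender's thread.
    void sendMessage(Object message) {
        dispatcher.dispatch(mailbox, message);
    }
}
```

The check after `setIdle()` matters: without it, a message enqueued while the mailbox was finishing its run could sit in the queue until some later send happened to schedule the mailbox again.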
The executor service in the case of Akka actors is, by default, a ForkJoinPool. There is a lot of information available about how ForkJoinPool works and why it is preferred for achieving high efficiency on multicore machines (http://gee.cs.oswego.edu/dl/papers/fj.pdf).
The MailBox extends ForkJoinTask and also maintains a concurrent queue to hold messages.
The queue is an instance of ConcurrentLinkedQueue, which can be accessed concurrently to enqueue and dequeue messages. By default it is an unbounded queue, so it keeps growing if messages arrive faster than the actor processes them.
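A sketch of a mailbox built this way in Java. The class and method names are assumptions; only `ForkJoinTask` and `ConcurrentLinkedQueue` are real JDK types. The guard that prevents the same mailbox from being submitted twice at once is deliberately left out here.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ForkJoinTask;
import java.util.function.Consumer;

// Hypothetical mailbox that is itself the task submitted to the ForkJoinPool.
class Mailbox extends ForkJoinTask<Void> {
    private final Queue<Object> queue = new ConcurrentLinkedQueue<>();
    private final Consumer<Object> receive;

    Mailbox(Consumer<Object> receive) {
        this.receive = receive;
    }

    // Safe to call from any thread; the queue is unbounded.
    void enqueue(Object message) {
        queue.offer(message);
    }

    // Drains the queue, handing messages to the actor one at a time in order.
    void processMailbox() {
        Object message;
        while ((message = queue.poll()) != null) {
            receive.accept(message);
        }
    }

    @Override
    protected boolean exec() {
        processMailbox();
        // Returning false leaves the task "not completed", so the same
        // mailbox object can be submitted to the pool again later.
        return false;
    }

    @Override
    public Void getRawResult() {
        return null;
    }

    @Override
    protected void setRawResult(Void value) {
        // A mailbox produces no result.
    }
}
```

With this shape, `pool.execute(mailbox)` on a `java.util.concurrent.ForkJoinPool` runs `exec()` on one of the pool's worker threads, and no wrapper object has to be allocated for each scheduling.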
The most important part of processing the message happens when the mailbox is scheduled for execution. The mailbox invokes ActorCell which in turn invokes the Actor’s receive method.
An interesting thing to note here is the concept of throughput. The throughput configuration defines how many messages an actor may process each time it is scheduled to execute on a thread.
The limit can be a number of messages or it can be time based. The time limit can be set in nanoseconds and specifies how long the actor may keep processing messages in one run. The default time limit is 0, meaning no limit. After processing the specified number of messages, or after the specified time has elapsed, the scheduled mailbox task returns. If messages are still waiting in the mailbox, it is scheduled again right away; otherwise it is scheduled again when the next message is sent to this actor.
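The batching rule can be sketched in Java like this. `ThroughputMailbox`, `processBatch` and the two limit fields are hypothetical names chosen for the sketch and do not reproduce Akka's internals.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Hypothetical mailbox that processes a bounded batch per scheduling.
class ThroughputMailbox {
    private final Queue<Object> queue = new ConcurrentLinkedQueue<>();
    private final Consumer<Object> receive;
    private final int throughput;               // max messages per run
    private final long throughputDeadlineNanos; // max time per run; 0 or less means no time limit

    ThroughputMailbox(Consumer<Object> receive, int throughput, long throughputDeadlineNanos) {
        this.receive = receive;
        this.throughput = throughput;
        this.throughputDeadlineNanos = throughputDeadlineNanos;
    }

    void enqueue(Object message) {
        queue.offer(message);
    }

    boolean hasMessages() {
        return !queue.isEmpty();
    }

    // Processes one batch and returns how many messages were handled.
    int processBatch() {
        long deadline = System.nanoTime() + throughputDeadlineNanos;
        int processed = 0;
        while (processed < throughput) {
            Object message = queue.poll();
            if (message == null) {
                break; // mailbox is empty
            }
            receive.accept(message);
            processed++;
            // Stop early once the time budget for this run is used up.
            if (throughputDeadlineNanos > 0 && System.nanoTime() - deadline >= 0) {
                break;
            }
        }
        return processed;
    }
}
```

After `processBatch()` returns, the caller would check `hasMessages()` and resubmit the mailbox if anything is left. Giving up the thread after each batch is what keeps one busy actor from starving the others that share the pool.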
So essentially, Actors are scheduled through their MailBoxes on a ForkJoinPool. It is important to understand this when coding Actors, or when using a framework built on Actors such as Akka HTTP, to avoid mistakes like making blocking calls from within Actors. A blocking call holds on to one of the pool's threads for its whole duration, so other actors scheduled on the same pool have to wait.