Celluloid internals: Proxy and Call objects

Following the previous two posts, one on concurrency primitives and abstractions in Ruby and one on EventMachine and the reactor pattern, I intended to cover Celluloid’s core ideas and internals in the last post of the series. However, while writing it, I realized that Celluloid is much bigger than EventMachine. Consequently, this will instead be a series of posts, each covering an essential part of Celluloid.

The Actor model

First, let’s define what actors actually are.

An actor is a computational entity that, in response to a message it receives, can concurrently:
* send a finite number of messages to other actors;
* create a finite number of new actors;
* designate the behavior to be used for the next message it receives.

Translation: the actor model is a system in which computation is done only by sending messages. This is an idealized abstraction. Actual implementations of the model rarely limit themselves to computing only in this way, and Celluloid is no exception.

An actor-based system is somewhat similar to an object-based system in that entities communicate with each other by sending messages. The difference between these two conceptual models is that in object-based systems computation is sequential, while in actor-based systems computation is concurrent. In that regard, the actor model more accurately describes the real world, because everything happens concurrently in the real world.

Since Ruby has no built-in Actor facilities, Celluloid has to build them.
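To make the idea concrete before looking at Celluloid itself, here is a minimal sketch of an actor built from a thread and a queue. TinyActor, tell and stop are hypothetical names chosen for illustration; this is not how Celluloid is implemented, only the smallest thing that behaves like a mailbox-driven actor.

```ruby
# A toy actor: one thread draining one mailbox, so messages are handled
# one at a time. TinyActor is a hypothetical name, not a Celluloid class.
class TinyActor
  def initialize(&behavior)
    @mailbox = Queue.new # thread-safe FIFO from Ruby's core library
    @thread = Thread.new do
      # Queue#pop blocks until a message arrives; :stop ends the loop.
      while (message = @mailbox.pop) != :stop
        behavior.call(message)
      end
    end
  end

  # Sending a message only enqueues it; the sender does not wait.
  def tell(message)
    @mailbox << message
    self
  end

  # Ask the actor to finish and wait until it has drained its mailbox.
  def stop
    @mailbox << :stop
    @thread.join
  end
end

results = Queue.new
doubler = TinyActor.new { |n| results << n * 2 }
doubler.tell(1).tell(2).tell(3)
doubler.stop
```

After stop returns, results holds 2, 4 and 6 in that order, because a single thread drains a FIFO mailbox.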

Transforming objects into something more

When we want to turn an object into an actor, we include the Celluloid module in its class. Once the module is included, its included hook gets called and the transformation process begins:

Actor creation

First, the target class gets extended with Celluloid’s class and instance methods. Next, the target class gets the ability to define inheritable properties, which are a cross between class-instance variables and class variables. Celluloid uses them to save configuration settings in the class’s scope. Finally, Celluloid defines three properties that make its instances actors:

  1. mailbox_class property,
  2. proxy_class property,
  3. task_class property.
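One way to picture an inheritable property is a class-level setting that subclasses inherit but can override without affecting the parent. The sketch below is an assumption about the general technique, not Celluloid’s actual code; InheritableProperty, SketchBase, PlainWorker and CustomWorker are hypothetical names.

```ruby
# Hypothetical helper illustrating the idea of an inheritable property.
module InheritableProperty
  def property(name, default: nil)
    ivar = "@#{name}"

    # One class-level method acts as reader (no argument) and writer (one argument).
    define_singleton_method(name) do |*value|
      if !value.empty?
        instance_variable_set(ivar, value.first) # stored on this class only
      elsif instance_variable_defined?(ivar)
        instance_variable_get(ivar)              # this class overrode the value
      elsif superclass.respond_to?(name)
        superclass.public_send(name)             # fall back to the parent class
      else
        default
      end
    end
  end
end

class SketchBase
  extend InheritableProperty
  property :mailbox_class, default: Queue
end

class PlainWorker < SketchBase; end

class CustomWorker < SketchBase
  mailbox_class Array # override for this subclass only
end

PlainWorker.mailbox_class  # => Queue (inherited)
CustomWorker.mailbox_class # => Array (overridden)
SketchBase.mailbox_class   # => Queue (untouched by the subclass)
```

A plain class variable (@@mailbox_class) would be shared by the whole hierarchy, and a plain class-instance variable would not be inherited at all, which is why a hybrid like this is useful for per-class configuration.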

These three properties make up the foundational parts of the library and we’ll have to cover each in detail if we hope to understand how Celluloid works. But, first things first:

Constructor overriding

We can see in the snippet above that Celluloid hijacks the target’s constructor method by overwriting the new method. Instead of returning instances of the target class, it starts returning proxy objects.
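The combination of an included hook and an overridden new can be sketched in a few lines. TinyCelluloid, its Proxy class and Greeter are hypothetical stand-ins, and the real library does considerably more at this point (it builds the actor, its mailbox and its thread); the sketch only shows how a class can end up returning proxies from new.

```ruby
module TinyCelluloid
  # Stand-in for the proxy object handed back to callers.
  class Proxy
    def initialize(subject)
      @subject = subject
    end
  end

  module ClassMethods
    # Shadows Class#new: build the real object, then return a proxy wrapping it.
    def new(*args, &block)
      subject = allocate                       # create the object without initializing it
      subject.send(:initialize, *args, &block) # initialize is private, hence send
      Proxy.new(subject)
    end
  end

  # Ruby calls this hook whenever the module is included in a class.
  def self.included(klass)
    klass.extend(ClassMethods)
  end
end

class Greeter
  include TinyCelluloid

  def initialize(name)
    @name = name
  end
end

Greeter.new("Ada").class # => TinyCelluloid::Proxy
```

Callers keep writing Greeter.new as usual, which is what lets the module change what construction means without any change to calling code.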

Proxy objects

The only thing we need to know about the Cell class at this point is that it knows how to create proxies and what kind of proxies to create. It does that in the last line of its constructor method.

Proxy creation

Proxies are objects that intercept regular method calls and convert them to something Celluloid calls the inter-actor message protocol. The Cell factory constructs objects of class Proxy::Cell, but the real work happens in the Proxy::Sync and Proxy::Async classes, which define what happens when sync and async methods are called.

The sync and async protocols boil down to intercepting method calls with Ruby’s method_missing facility, wrapping them into Call objects and pushing them onto a Mailbox to be processed.
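The interception step on its own can be illustrated without any threads. In this sketch, RecordedCall and RecordingProxy are hypothetical names, and the mailbox is just a Queue; the proxy inherits from BasicObject so that almost nothing is defined on it and nearly every call falls through to method_missing.

```ruby
# Hypothetical value object describing one intercepted call.
RecordedCall = Struct.new(:method_name, :arguments, :block)

class RecordingProxy < BasicObject
  def initialize(mailbox)
    @mailbox = mailbox
  end

  # Every unknown method lands here; wrap it up and push it onto the mailbox.
  def method_missing(name, *args, &block)
    # Constants must be fully qualified inside a BasicObject subclass.
    @mailbox << ::RecordedCall.new(name, args, block)
    self
  end
end

mailbox = Queue.new
proxy = RecordingProxy.new(mailbox)
proxy.greet("Ada")
proxy.resize(2, 3)

first = mailbox.pop
first.method_name # => :greet
first.arguments   # => ["Ada"]
```

Nothing has been executed yet at this point: the calls are only recorded as data, and it is up to whoever owns the mailbox to process them.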

Sync method interception

We can see that both proxy objects are intercepting all method calls and pushing them onto the mailbox. There are two important differences between them, though:

  • SyncProxy objects invoke a value method that ultimately returns the result to the sender, while AsyncProxy objects don’t (they just return self),
  • SyncProxy objects wrap the method call in a SyncCall object, while AsyncProxy objects wrap it in an AsyncCall object.
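The two differences listed above can be shown as a round trip through a toy actor. All class names here (Request, SketchActor, SketchSyncProxy, SketchAsyncProxy, Counter) are hypothetical, and the sketch has no error handling: if the wrapped method raises, the actor thread dies and a sync caller would wait forever.

```ruby
# Hypothetical message: what to call, plus an optional queue to reply on.
Request = Struct.new(:method_name, :arguments, :reply_to)

class SketchActor
  attr_reader :mailbox

  def initialize(subject)
    @mailbox = Queue.new
    @thread = Thread.new do
      while (request = @mailbox.pop) != :shutdown
        result = subject.public_send(request.method_name, *request.arguments)
        request.reply_to << result if request.reply_to # only sync calls expect a reply
      end
    end
  end

  def shutdown
    @mailbox << :shutdown
    @thread.join
  end
end

class SketchProxy
  def initialize(mailbox)
    @mailbox = mailbox
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class SketchSyncProxy < SketchProxy
  def method_missing(name, *args)
    reply = Queue.new
    @mailbox << Request.new(name, args, reply)
    reply.pop # block the caller until the actor answers
  end
end

class SketchAsyncProxy < SketchProxy
  def method_missing(name, *args)
    @mailbox << Request.new(name, args, nil)
    self # fire and forget
  end
end

class Counter
  def initialize
    @count = 0
  end

  def increment(by = 1)
    @count += by
  end
end

actor = SketchActor.new(Counter.new)
SketchAsyncProxy.new(actor.mailbox).increment(5) # returns the proxy immediately
SketchSyncProxy.new(actor.mailbox).increment     # => 6, after waiting for the actor
actor.shutdown
```

The sync call sees 6 because the mailbox is a FIFO drained by a single thread, so the earlier async increment is always processed first.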

Call objects

Call objects represent requests towards an actor. They sit in the mailbox and wait patiently to be processed by the actor. SyncCall and AsyncCall objects represent synchronous and asynchronous requests (method calls) respectively.

When their turn comes, they are processed by the actor (usually in the context of a Task) by invoking their dispatch method. The dispatch method simply invokes the method that was intercepted in the Proxy object, using Ruby’s public_send method:

Call dispatch

In addition to invoking the intercepted method, the SyncCall object pushes the response onto the sender’s mailbox.

result = obj.public_send(@method, *@arguments, &_b)
@sender << result # @sender == sender's mailbox
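The two lines above can be placed in a pair of small classes to show how a sync call might layer the reply on top of plain dispatch. SketchCall and SketchSyncCall are hypothetical names and the constructors are assumptions; only the use of public_send and the push onto the sender’s mailbox come from the description above.

```ruby
class SketchCall
  def initialize(method_name, arguments = [], block = nil)
    @method = method_name
    @arguments = arguments
    @block = block
  end

  # public_send respects visibility: private methods cannot be reached through a call.
  def dispatch(obj)
    obj.public_send(@method, *@arguments, &@block)
  end
end

class SketchSyncCall < SketchCall
  def initialize(sender, method_name, arguments = [], block = nil)
    super(method_name, arguments, block)
    @sender = sender # the sender's mailbox
  end

  # Same dispatch as above, plus delivering the result back to the sender.
  def dispatch(obj)
    result = super
    @sender << result
    result
  end
end

sender_mailbox = Queue.new
SketchSyncCall.new(sender_mailbox, :upcase).dispatch("celluloid")
sender_mailbox.pop # => "CELLULOID"
```

Because dispatch goes through public_send rather than send, something like SketchCall.new(:puts, ["hi"]).dispatch(Object.new) raises NoMethodError: puts is a private method, so a proxy cannot be used to reach an actor’s private interface.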

Proxies and calls are the foundations of what Celluloid calls the inter-actor message protocol. Once (partially) broken down, we can see that the basis of that protocol is Ruby’s method_missing, an extreme late-binding facility.

Many articles have already been written about the performance implications of Ruby’s method_missing. Even without considering method_missing penalties, Celluloid wraps every method call in several objects (a proxy, a Call and a mailbox entry), which increases memory usage significantly.

Consequently, there is a significant performance penalty to using Celluloid in your codebase. Every abstraction has a cost, but in my opinion, Celluloid is worth it in most cases. Even Mike Perham says so, and he removed it as a dependency from Sidekiq.

Originally published at pltconfusion.com.