Reactive PHP Events
Distributed. Asynchronous. Magical.
A few days ago, our lovely Development Manager asked me to share at Show and Tell. It’s a Friday-event where a couple of people, here at SilverStripe, talk a bit about some of the recent work they’ve done.
I decided to talk about a ReactPHP-based library I have been working on. It’s near enough to the general asynchronous PHP stuff I talk about that I didn’t need to prepare much. It’s also a topic I could fit into the 20 minute time slot, given it would take at least 10 minutes to give a general introduction to ReactPHP.
I also decided (just this morning) to blog about it. Some of the slides are visually interesting, and I think it’s an easy introduction to ReactPHP. If you’re already familiar with ReactPHP, you might find the first bit simplistic. Skip ahead to where I start talking about event emitters.
It’s also likely that this content will find its way into the Async PHP book I am writing. I promise it will be more in-depth than this post. This should at least give you a feel for some of the interesting content contained therein.
So, let’s begin with a summary of the talk.
It’s difficult to distil the theme of this talk into anything else. ReactPHP is an awesome topic on its own, but it’s not the essence of what I’ve recently experienced. PHP is an old language, but we can use it to build new and exciting things. It can work with old concepts (like message queues and event emitters) to create fascinating, new designs.
It can solve strange and wonderful problems. Even when you’re engaged in boring work, or a legacy codebase, you can divert your attention to some small area of study, and find some fascinating work that someone else is doing in that area. As far as creative outlets go, PHP is a treasure-trove.
Let’s divert our attention to the problem ReactPHP was made to solve, and then we’ll look at how ReactPHP solves that problem.
JavaScript-Land
Hands up if you’ve seen (or written) some JavaScript like this:
$(".button").click(function() {
    $(this).css("font-weight", "bold")
        .find(".label").html("You clicked a thing!");
});
Assume it’s using jQuery, and you can probably see exactly what it’s doing. A plain JavaScript representation of this functionality might look something like this:
var buttons = document.querySelectorAll(".button");
Array.prototype.slice.call(buttons, 0).forEach(function(button) {
    button.addEventListener("click", function(event) {
        button.style.fontWeight = "bold";
        // look for the label inside the clicked button, not the whole document
        var label = button.querySelector(".label");
        label.innerHTML = "You clicked a thing!";
    });
});
jQuery’s actual internals are more involved, but this is roughly what it could be doing, under the hood.
We can describe this code in a few steps:
- We collect all DOM elements, with the class of .button.
- For each of these elements, we add an event listener. This event listener will be triggered whenever someone clicks on the element the listener is attached to.
- When a .button is clicked, its font-weight is changed to bold.
- When a .button is clicked, the .label element it contains is given the HTML value of You clicked a thing!.
It’s tempting to see code like this and dismiss it as simple, or to be interested only in what happens inside the event listener. It is simple code, and most of the functionality does happen inside the listener function. How often do we think about the mechanism that enables this kind of interaction, though?
Web browsers, and in particular JavaScript engines, enable a concept called an event loop. That’s not to say that only web browsers can have event loops, but rather that the concept is enabled by default, and supremely useful for the interaction that web applications need.
Event loops are like infinite loops that wait for interesting things to happen and then let everyone else know about them. The kinds of events we’re used to seeing (if we’re working within the web browser) are things like click, mouseover and domready.
These are pre-defined event types which make sense to have in the context of a web browser. They’re not the only kinds of events an event loop can deal with, but rather some of the most useful ones in that situation.
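To make the idea concrete, here is a minimal sketch of an event loop in plain PHP. TinyLoop and its methods are names I made up for illustration; this is not how browsers or ReactPHP implement their loops. In particular, a real loop sits and waits for new things to happen, whereas this one stops as soon as it runs out of pending events.

```php
<?php

// A toy event loop, for illustration only.
final class TinyLoop
{
    /** @var array<string, callable[]> listeners, keyed by event name */
    private array $listeners = [];

    /** @var array[] events that still need to be announced */
    private array $pending = [];

    public function on(string $name, callable $listener): void
    {
        $this->listeners[$name][] = $listener;
    }

    // Something interesting happened; remember it for the next turn of the loop.
    public function push(string $name, ...$arguments): void
    {
        $this->pending[] = [$name, $arguments];
    }

    // Keep turning until there is nothing left to tell anyone about.
    public function run(): void
    {
        while ($this->pending !== []) {
            [$name, $arguments] = array_shift($this->pending);

            foreach ($this->listeners[$name] ?? [] as $listener) {
                $listener(...$arguments);
            }
        }
    }
}

$log = [];
$loop = new TinyLoop();

$loop->on("click", function (string $target) use (&$log, $loop) {
    $log[] = "clicked {$target}";
    // Listeners may queue further events for a later turn.
    $loop->push("done");
});

$loop->on("done", function () use (&$log) {
    $log[] = "done";
});

$loop->push("click", ".button");
$loop->run();

print implode("\n", $log) . "\n";
```

The listener for click never calls the listener for done directly. It only queues an event, and the loop decides when to let everyone else know.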
There’s nothing preventing filesystem events or socket events from being used with an event loop. In fact, that’s pretty common too…
I’m not sure why I used a picture of a chicken, to represent NodeJS. Whatever it was, I hope you don’t think fowl of me for doing so…
I actually told that joke and lived. I promise it is the last time. Until the next time.
NodeJS is a JavaScript environment for out-of-the-browser applications. It’s like a JavaScript mainline into your machine. Consequently, you can do all kinds of awesome with it, which would simply not be possible in a web browser. Just look at the API...
I’ve taken the simple HTTP reference implementation right off of the homepage:
var http = require("http");
var server = http.createServer(function (request, response) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.end("Hello World");
});
server.listen(1337, "127.0.0.1");
Ok, I did change it a little. At least I kept the 1337 bit…
GET OFF MY BACK, ALREADY!
These 6 lines of JavaScript don’t do NodeJS justice. They are an elegant facade over all the code required to accept socket connections, parse HTTP messages and interact with the responses.
And there are many events going on too! NodeJS uses an event loop to listen for things like incoming socket connections. That’s what makes that callback (within createServer) possible. If there was nothing to emit incoming socket connections to, how could the listening code know when they happened? If there was nothing to emit parsed HTTP messages to, how could the listening code know when they had been sent?
Again we see simple code representing the awesome utility of an event loop, and we haven’t even begun to look at how it could help us in PHP.
At this point I asked for a volunteer to demonstrate how to build a traditional Hello World application, in PHP.
How do we build a traditional Hello World application, in PHP? It’s not a trick question. Look, I’ll show you how:
<?php
// ...your application code goes here
We begin with a simple file template. We’ve got to specify that this is a file that should be interpreted as PHP code. We do that by writing <?php. The next bit can get tricky, so pay attention:
<?php
print "Hello World";
Ok, it was a bit of a trick question. But only because the process of creating a Hello World application is deceptively simple.
That’s because half of the work is being done elsewhere. More specifically, Apache/Nginx/etc. are handling all the socket management and HTTP message negotiation. They are divvying up the traffic and sending the relevant requests to mod_php et al.
By the time we PHP folk get these requests, they’ve already been parsed and prepared. Fantastic!
This is one of the reasons PHP is so popular. It’s so easy to get started. Yet that ease-of-use is limiting our mental model for application architecture. The closest our dear PHP gets to the NodeJS Way Of Doing Things™ is the development server.
The development server is a command-line utility to spawn a tiny request handler. We can run normal PHP code as if it was being interpreted on the back of a successful request to a web server.
Chris Boden and Igor Wiedler joined forces to bring the PHP community a precious gift. And though others have stepped in to help, these gifted developers were instrumental in the evolution of PHP.
Perhaps you think I’m being dramatic. Well zip it! These folks are amazing and ReactPHP is amazing. You just haven’t spent enough time with it to come to that inevitable conclusion.
ReactPHP is a collection of libraries, but those that seem to get most of the spotlight are the event loop and socket/HTTP servers that go with it. Already heard of ReactPHP? It’s probably because of code like this:
$loop = React\EventLoop\Factory::create();
$socket = new React\Socket\Server($loop);
$http = new React\Http\Server($socket);
$http->on("request", function($request, $response) {
    // the status code comes first, then the headers
    $response->writeHead(200, ["Content-Type" => "text/plain"]);
    $response->end("Hello World");
});
$socket->listen(1337);
$loop->run();
It looks a lot like the NodeJS reference HTTP server implementation, doesn’t it? There’s a bit more to do with the event loop, but otherwise it’s almost the same. It’s based on similar operating principles, though it has to achieve them using many different tools.
It’s also a little clearer that the event loop is involved. The socket server plugs into it so that incoming socket connections will find their way to code that interprets them. The HTTP server decorates this behaviour, adding HTTP message parsing to the equation.
Finally, we can add our own listeners to the HTTP server, so as to be able to interact with the incoming request and outgoing response. It’s fascinating seeing this within the bounds of PHP. We’ve conditioned ourselves to ignore the socket/HTTP message negotiation aspects of PHP requests.
We’ve learned to give up that control, and getting it back is equal parts interesting and scary. We need to do more to get back to the state we were in, with Apache, but we also get far more control over the request/response cycle.
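If you’re curious what taking that control back involves, here is a rough sketch using nothing but PHP’s built-in stream functions. It is not ReactPHP’s implementation; buildResponse and tick are hypothetical helpers of mine. To keep the script from running forever, it plays the part of the browser itself and only turns the loop once. A real server would call tick inside an endless loop.

```php
<?php

// Write the HTTP response by hand; this is the part Apache/Nginx usually hide from us.
function buildResponse(string $body): string
{
    return "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/plain\r\n"
        . "Content-Length: " . strlen($body) . "\r\n"
        . "Connection: close\r\n"
        . "\r\n"
        . $body;
}

// One turn of a hand-rolled loop: wait for a connection, then answer it.
function tick($server, int $timeoutSeconds = 1): bool
{
    $read = [$server];
    $write = null;
    $except = null;

    // Wait until the listening socket is readable, meaning a client is waiting.
    if (stream_select($read, $write, $except, $timeoutSeconds) < 1) {
        return false; // nothing happened during this turn
    }

    $connection = stream_socket_accept($server, 0);
    if ($connection === false) {
        return false;
    }

    fread($connection, 8192); // read the request; a real server would parse it
    fwrite($connection, buildResponse("Hello World"));
    fclose($connection);

    return true;
}

// Port 0 asks the operating system for any free port.
$server = stream_socket_server("tcp://127.0.0.1:0", $errno, $errstr);
if ($server === false) {
    exit("could not start server: {$errstr}\n");
}
$address = stream_socket_get_name($server, false);

// Pretend to be the browser, from the same script, so the example terminates.
$client = stream_socket_client("tcp://{$address}");
fwrite($client, "GET / HTTP/1.1\r\nHost: {$address}\r\n\r\n");

tick($server);

$response = stream_get_contents($client);
fclose($client);
fclose($server);

print $response;
```

Even this toy has to think about sockets, timeouts and message formats. That is the work we’re used to handing off, and the work ReactPHP wraps up in its socket and HTTP servers.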
A Wild Problem Appears!
It may seem like there’s a dependable parity between the NodeJS and ReactPHP environments. This is partly true. We can read and write to the same kinds of files. We can open and communicate through the same kinds of sockets. We can do the same kinds of DNS resolution.
The next bit might sound like I dislike the “old box of tools”. Like many aspects of life, it’s possible to love something while (at exactly the same time) being critical of its bad parts. I love PHP; the community and the language. That’s why I want to push the boundaries of the language and mental models we work with every day.
The trouble is that ReactPHP is working with an old box of tools. PHP was born in a time before JavaScript was a thing. Remember that JavaScript grew out of the competition between browser vendors. Different folks were trying to win the hearts of developers by offering the next best kind of interaction and animation. These new features were rooted in the event loop model, and every bit as asynchronous as the first unobtrusive DOM events.
Don’t get me wrong: JavaScript can be every bit as synchronous as the traditional PHP code we’ve been writing for decades. It’s just that in a web browser, synchronous execution feels sluggish. Battling for the affections of users has pushed JavaScript into an impossible race to be the most responsive, the best looking, the most useful.
Synchronous code is a hurdle that JavaScript developers have long since learned to jump. That’s how young JavaScript developers are taught. It’s a useful mental model to have, and so far from what young PHP developers are taught that it’s often difficult to change hats.
PHP was created before non-blocking interaction became the 11th commandment. Over the years a few areas have been gifted asynchronous implementations, but it’s hard to fix a problem people don’t see themselves.
Every time I talk about asynchronous PHP, there’s always that one person who is like “Why don’t you just use NodeJS for that?”
I realise that professional developers use the right tool for the job. If NodeJS is the right tool for your job then use it. Professional developers also have a responsibility to improve themselves and their tools outside of the office. Uncle Bob taught me that, so it must be true!
My point is we can push the limits of the language and user-land code. You don’t only need to write code that will go into production. You’re already writing throw-away code. It’s just the kind that you need to write before discovering how that production application actually needs to work.
Why not channel a little of that energy (and argumentation) into improving the tools you use to feed yourself? Or learn Go…
What was I saying, again? Oh yes! PHP is mostly blocking, which means we’re still stuck for how to make all this non-blocking HTTP stuff work for us. What good is high-concurrency if requests are still blocking the moment we need to do any meaningful work?
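Here is a small sketch of that problem, assuming a single PHP thread working through two handlers. The handler names are made up, and usleep stands in for any blocking call (a database query, a file read, a slow API).

```php
<?php

$started = microtime(true);
$finishedAt = [];

$handlers = [
    // stands in for a blocking query or file read
    "slow" => function () {
        usleep(200000);
    },
    // has nothing to wait for, but still has to queue behind "slow"
    "fast" => function () {
    },
];

// A single thread works through the handlers one at a time.
foreach ($handlers as $name => $handler) {
    $handler();
    $finishedAt[$name] = microtime(true) - $started;
}

printf("slow finished after %.2fs\n", $finishedAt["slow"]);
printf("fast finished after %.2fs\n", $finishedAt["fast"]);
```

The fast handler does no work of its own, yet it cannot finish before the slow one does. An event loop doesn’t change that; it only helps when the waiting itself is non-blocking.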
There are many libraries and extensions we could use to solve this problem. ReactPHP deals with the transition from synchronous to asynchronous, but (until the language evolves to include synchronous and asynchronous, in equal amounts) we need something to help us transition from asynchronous to parallel.
We could use any of the wonderful options, like Pthreads and Gearman. Instead I’m going to look at how an old idea, like message queues, can help.
Event Emitters
Let’s first look at an event emitter. There are many PHP event emitter libraries, but the best is undoubtedly Frank de Jonge’s League\Event:
$emitter = new League\Event\Emitter();
$emitter->addListener("foo", function($event) {
    print "foo happened!";
});
On its own, this does nothing meaningful. We have created an emitter and added a silly listener. But when we…
$emitter->emit("foo");
…we’ll see the printed message. Don’t be fooled, this isn’t an example of the event loop. There is no event loop here! It’s not even asynchronous. Don’t take my word for it. Let’s look at how the listeners are invoked:
/**
 * Invoke the listeners for an event.
 *
 * @param string         $name
 * @param EventInterface $event
 * @param array          $arguments
 *
 * @return void
 */
protected function invokeListeners(
    $name, EventInterface $event, array $arguments
)
{
    $listeners = $this->getListeners($name);
    foreach ($listeners as $listener) {
        if ($event->isPropagationStopped()) {
            break;
        }
        call_user_func_array([$listener, 'handle'], $arguments);
    }
}
Each of the listeners is invoked in a tight loop, one after another. If you have many, and they all do computationally-intensive things, then you’ll notice the delay before emit returns. If you have few, tiny listener functions, then it’s still blocking/synchronous. It’s just harder to notice.
I’m not hating on League\Event. It really is the best implementation of this pattern, but the pattern isn’t enough to give us asynchronous code.
Furthermore, the listeners and emitters have to be on the same machine and thread. That’s a recipe for disappointment. If we’re looking for simple ways to process in parallel, this library will not be enough.
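You can see the same behaviour with a stripped-down emitter. SimpleEmitter is a hypothetical stand-in I wrote for this sketch, not League\Event’s code, but it follows the same pattern: emit walks through the listeners and only returns once the last one has finished.

```php
<?php

// A bare-bones emitter, for illustration only.
final class SimpleEmitter
{
    /** @var array<string, callable[]> listeners, keyed by event name */
    private array $listeners = [];

    public function addListener(string $name, callable $listener): void
    {
        $this->listeners[$name][] = $listener;
    }

    // Every listener runs here, in order, on the calling thread.
    public function emit(string $name, ...$arguments): void
    {
        foreach ($this->listeners[$name] ?? [] as $listener) {
            $listener(...$arguments);
        }
    }
}

$order = [];
$emitter = new SimpleEmitter();

$emitter->addListener("foo", function () use (&$order) {
    usleep(100000); // pretend to do something expensive
    $order[] = "first listener";
});

$emitter->addListener("foo", function () use (&$order) {
    $order[] = "second listener";
});

$order[] = "before emit";
$emitter->emit("foo");
$order[] = "after emit";

print implode("\n", $order) . "\n";
```

Nothing after emit runs until both listeners are done, and the second listener has to wait for the first. That’s synchronous code, however event-like it looks.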
Message Queues
It’s not hard to guess the functionality of something called a message queue. It’s essentially a queue of messages, a potentially distributed key-value storage implementation. And there are many of these doing the rounds!
My favourite is ZeroMQ. It just so happens there’s a ReactPHP library to use ZeroMQ in the context of an event loop:
$loop = React\EventLoop\Factory::create();
$context = new React\ZMQ\Context($loop);
$socket = $context->getSocket(ZMQ::SOCKET_PUSH);
$socket->connect("tcp://127.0.0.1:5555");
$socket->send("foo");
$loop->run();
This is the first half of working with a message queue, and can be described in the following steps:
- We create the familiar event loop, and a ZMQ context to use it. The context is a factory for creating different kinds of ZMQ sockets. We’ll only use the most basic ones for the purposes of this post.
- We connect to a remote location (which happens to be just another part of our local machine).
- We send a message through that connection. This gets stored on the other end of the line, but we won’t know that until we write the code that pulls it out.
- Since this is ReactPHP-based, and we’re using the event loop, we also need to run the loop. As with the HTTP server example, that turns this into a long-running script.
In a separate tab, we need to run a script, containing something like this:
$loop = React\EventLoop\Factory::create();
$context = new React\ZMQ\Context($loop);
$socket = $context->getSocket(ZMQ::SOCKET_PULL);
$socket->bind("tcp://127.0.0.1:5555");
$socket->on("message", function($message) {
    print "message received: {$message}\n";
});
$loop->run();
It’s almost identical (which is one of the things I love about this particular queue, and related libraries). Notice the messages are handled within a callback? This particular ZeroMQ library is asynchronous. Sending and receiving messages is asynchronous, though we can still block the thread by doing something blocking within that callback.
Event emitters and message queues may seem unrelated, and on their own they can’t help us transition from asynchronous to parallel execution. Together they can give us a good start.
Event emitters are synchronous code masquerading as asynchronous code. Message queues are key-value stores. Together they can give us distributed, asynchronous events.
A while ago I began working on this problem. At first I tried running parallel code with Gearman. Gearman is great, but it has its faults. Ultimately they prevented it from being an easy solution to a difficult problem.
Gearman has two modes. The first is synchronous. The benefit of this mode is that you can attach callbacks to the client, and receive feedback from the worker. Unfortunately, if you decide to use the second (asynchronous) mode, you lose the callback support.
If you push tasks onto the background queue, Gearman becomes little more than a message queue abstraction. The trouble with message queues is that they are weak. They don’t do any meaningful work until you give them the tools to do so. So Gearman is slightly more helpful, in this regard.
Another problem with message queues is that they are unidirectional. This makes them great for things like parallel execution of tasks that you no longer need control over or feedback from. Want to send an email, outside of the request/response cycle? Queues (and Gearman background tasks) can help you! Want to parse a log file, and return the results to the browser in the same request? You’re out of luck, friend.
Instead of looking for other message queues/extensions, I decided to make a distributed emitter. It uses both of these libraries I’ve shown, and it’s relatively manageable:
$server = new AsyncPHP\Remit\Adapter\ZeroMQ\Server(
    // some boring guff
);
$server->addListener("tick", function ($event, $i) {
    print "TICK {$i}\n";
});
$server->addListener("done", function ($event) {
    print "DONE\n";
});
$loop->run();
This code bears some resemblance to the League\Event example I showed you earlier. Behind the scenes, it’s using League\Event to store and manage the listeners, and ZeroMQ to listen for remotely emitted events.
The companion code to this is:
$client = new AsyncPHP\Remit\Adapter\ZeroMQ\Client(
    // some more boring guff
);
foreach (range(1, 5) as $i) {
    $loop->addTimer($i, function () use ($client, $i) {
        $client->emit("tick", $i);
    });
}
$loop->addTimer(6, function () use ($client) {
    $client->emit("done");
});
$loop->run();
The server runs during the request/response cycle, and listens for remotely emitted events. The client emits events during the processing of worker tasks. The client isn’t able to add listeners of its own — it can only emit to listeners on the server.
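The shape of that idea fits in a short sketch. This is not the library’s code. SketchClient and SketchServer are hypothetical classes, a local socket pair stands in for ZeroMQ, and I’ve assumed a JSON line per event as the wire format. Unlike the real listeners, these ones don’t receive an event object as their first argument.

```php
<?php

// The emitting side: it only serialises events and pushes them down the wire.
final class SketchClient
{
    private $socket;

    public function __construct($socket)
    {
        $this->socket = $socket;
    }

    public function emit(string $name, ...$arguments): void
    {
        fwrite($this->socket, json_encode([$name, $arguments]) . "\n");
    }
}

// The listening side: it owns the listeners and invokes them locally.
final class SketchServer
{
    private $socket;

    /** @var array<string, callable[]> listeners, keyed by event name */
    private array $listeners = [];

    public function __construct($socket)
    {
        $this->socket = $socket;
        stream_set_blocking($socket, false); // never sit waiting for messages
    }

    public function addListener(string $name, callable $listener): void
    {
        $this->listeners[$name][] = $listener;
    }

    // Call this on every turn of an event loop: drain messages, invoke listeners.
    // (A sturdier version would buffer partial lines; this sketch assumes whole ones.)
    public function tick(): void
    {
        while (($line = fgets($this->socket)) !== false) {
            [$name, $arguments] = json_decode($line, true);

            foreach ($this->listeners[$name] ?? [] as $listener) {
                $listener(...$arguments);
            }
        }
    }
}

// Two connected sockets in one process, standing in for the ZeroMQ PUSH/PULL pair.
[$pull, $push] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

$output = [];

$server = new SketchServer($pull);
$server->addListener("tick", function (int $i) use (&$output) {
    $output[] = "TICK {$i}";
});
$server->addListener("done", function () use (&$output) {
    $output[] = "DONE";
});

$client = new SketchClient($push);
foreach (range(1, 3) as $i) {
    $client->emit("tick", $i);
}
$client->emit("done");

$server->tick();

print implode("\n", $output) . "\n";
```

The client never touches a listener. It could just as well live in another process, or on another machine, as long as its messages reach the server’s socket.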
While this doesn’t go the whole way to enabling parallel execution, it is compatible with any parallel execution model that can access the listening server and the emitting client. I have other work, which builds on this to do that remote execution, but it deserves its own post.
I hope you’ve found this as stimulating to read as it was for me to write. I’m excited to write about more of this stuff — the practical application of asynchronous and parallel execution to every-day code. Ask me questions, and thanks for reading!