Programming Servo: shipping message-ports(via a detour into Spectre)

Published in

Programming Servo

20 min readOct 19, 2019

The thing about contributing to Servo is that you keep learning new things about the Web platform.

Personally, I had never used messaging on the web when developing web applications, and I find it a fascinating idea.

Web-messaging enables developers to provide cross-site API’s without having to go through a server, all the while leveraging the client-side security model of the Web. And since it happens on the client, it could be more transparent to the end-user, and probably easier to block if necessary.

Implementing message-ports also raises interesting architectural questions. In an earlier Web(like, in 2017), an API like message-ports could have been implemented with some sort of cross-thread communication. In 2019 however, it’s going to have to go across process. Why? Something known as “Spectre”.

How I learned to stop worrying about Spectre, and love IPC

The difficulty lies not so much in developing new ideas as in escaping from old ones. John Maynard Keynes

Heard of “Spectre” already? Wondering whether it’s a “big deal”, or whether it can actually be exploited or not?

After reading “Spectre is here to stay: An analysis of side-channels and speculative execution”, written by the team at Google working on V8, I got really excited. Actually, it might be the article that got me the most excited in a while.

Why? Because it’s a paradigm shift, and the point is not whether it’s an easy exploit.

Here is Spectre in one sentence: “Inside a given process, nothing is private”.

What does that mean? Basically, when it comes to privacy of data, the Rust compiler has now officially been downgraded to a linter.

Declaring members of a struct without any pub and carefully not making them available to other code via any public method? The compiler will indeed complain if you or another developer tries to compile code accessing those private variables.

But once that code runs, you have to assume everything is readable by any other code running in that same process.

Now, this might strike you as a bit of a flippant statement or some wild exaggeration, and indeed, the code you carefully either wrote, or reviewed, personally, isn’t going to spontaneously mount a Spectre attack against itself once it’s running somewhere and you’ve turned your back on it.

So Spectre is mostly relevant if your code is meant to run(alongside?) other people’s code, for example a Web browser running some JS or Wasm code downloaded over the internet, and if you’re trying to build some sort of security boundary between “other people’s code” and your own, or between different “other people’s code” coming from different sources.

And the good news is, there is an upshot to all this, and it’s actually pretty encouraging.

There is one “very effective” form of mitigation against Spectre: process isolation.

And that doesn’t even mean some sort of complicated sandboxing of processes. Basic multiprocessing will get you there. The difficult part is probably updating your mental model.

Whereas previously you might have build security boundaries in your code using language constructs, there now remains only one game in town: if you don’t want it readable to some “other code”, do not move it to the process where that “other code” is running.

To better understand what this means for the Web, let’s take a look at the typical browser architecture.

On strange loops, secret agents, shadowy groups, and hidden auxiliaries.

Let’s say that you’re reading the current page in a tab in your favorite browser, with the url in the address bar pointing to https://medium.com. Then, you happen to have another tab open at https://doc.rust-lang.org.

So far so good.

Now, both tabs represent what is known as a “browsing context”, which contains a “window”(actually, a “WindowProxy”, but let’s not go there for now), with such a window not being equal to a window running on your OS, more like “something managing web-content”. Hence your tab contains a window, but that window might contain other windows, in the form of, you’ve guessed it, iframes.

Furthermore, you might have noticed that webpages have moved on from being purely static documents, and that what you’re looking at appears functionally rather similar to an application.

What is the environment in which this application runs? It’s known as an event-loop, and browsers have various types of them. Let’s focus on just two for now: “window” and “worker” event-loops.

Oh yeah, I forgot to mention that your “browsing context” contains code that could spawn various background workers, so there could be more stuff running than meets the eye. Anyway, workers are easy, because they always run on a dedicated event-loop. It’s window event-loops that are a bit harder because they can end-up being shared among windows.

The spec reads that: “A window event loop is the event loop used by similar-origin window agents.”

Ok so this is the point were you think I’ve lost it, because I just told you a “tab” contains a window/browsing context, which runs on an event-loop, but now it says the event-loop is used by something called an “agent”? So I forgot to mention that the window actually (sort-of)belongs to an “agent”.

It’s not so much the window that runs on the event-loop, it’s the agent(or it “uses” an event-loop to run windows?), and there are different types of agents. The two we’ll mention for now are, you’ve guessed it: window agents, and worker agents.

Ok, and “window agents” is actually a misnomer, because the correct name is “Similar-origin window agent”.

Ahah! The “similar-origin” part of the name sort-of implies there could be more than one window in this thing, grouping them by the “similarity” of their origin?

Yeah that’s pretty close. The definition reads “An agent (..)whose set of realms consists of all realms of all same-agent Window objects.”

Same-agent Window objects? ***!? Does this thing ever get to the point?

Ok, once you get over the initial “you call that a standard, I call it madness” reaction, you get the idea: a “similar-origin window agent”, is an agent using a window event-loop, and it runs a bunch of windows who are all considered to be “same-agent”. What this means is that more than one window will run on the same event-loop(more precisely, be part of the same agent), in some cases.

Ok, so when can you consider windows to be “same-agent”?

Two things:

They need to be part of the same “browsing context group”,
Their origin needs to be “similar”(let’s keep it simple for now).

Let’s get the origin part out of the way first: https://doc.rust-lang.org and https://medium.com are different origins.

https://help.medium.com/ and https://medium.com are also different origins, but they are similar enough for the “same-agent” test.

So will they always be part of the same “similar-origin window agent”? Nope. Only if they are also part of the same group.

How does that association happen? As far as I know, there are only two ways:

The windows need to have a parent/child relationship, for example through an iframe nested in the parent’s document. In that case the child is known as an “nested browsing context”.
One window needs to have been opened by the other in another tab or pop-up, using an API like window.open or a link with target=_blank. Such a “Tab-or-pop-up-opened-by-another-page” is also known as an “auxiliary browsing context”.

So, to recap, if you manually open https://medium.com in one tab, and then https://help.medium.com/ in another, they won’t belong to the same “similar-origin window agent”, because even though their origins are similar enough, they’re not part of the same group.

If https://help.medium.com/ is nested via an iframe on https://medium.com, or if it was opened in a new tab or pop-up via a call to window.open, both windows will be “same-agent”, and run on the same “similar-origin window agent”, running a “window event-loop”(unless the noopeneroption is used, see Step 15 of the “window-open-steps”).

Actually, an event-loop could also be shared between “similar-origin window agents” and, for what it’s worth, you could run all window-agents on the same event-loop. But that’s probably an archaic approach to things, unless you’re building something like a headless browser without any security requirements just meant to index public pages.

I told you workers were easy, workers always just run in their own event-loop!

You start by making a big deal about processes, and now you go on forever about event-loops, where’s the point?

Ok, so, in a “Pre-Spectre” world, an “event-loop” running one or several similar origin windows, might have simply be implemented as a thread running tasks sequentially.

In a “Post-Spectre” world, it is, or soon will be, implemented as a thread running tasks sequentially, inside a separate process.

Why? Two reasons:

You don’t want a page to be able to read stuff that never is supposed to be available to it. Remember “inside a process, nothing is private”? So if you run JavaScript in a thread inside a unique browser process, the JS could read all the memory of the entire browser.
You don’t want a page running on one window event-loop, to be able to read a page running on another window event-loop. In other words, no matter how much you enjoy this article, you wouldn’t want the code running at https://doc.rust-lang.org in your second tab to be able to read it as well.

Fortunately, as you’ve undoubtedly just meticulously read(the previous section was not something you’d skip, right?), browsers already have a pretty well developed model of isolating pages in different event-loops, so it doesn’t take a large stretch of imagination to see those event-loops running not only in separate threads, but also separate processes. And in fact, browser have been going down that road for a long time now, for various reasons, even before Spectre(almost as if someone saw that one coming).

The one problem that arguably remains is the whole “same-agent” business, which means that different yet similar origins are supposed to run on the same event-loop, so it’s hard to isolate them in separate processes.

So if “main-page.**.com” opens a new tab for you to go to “secure-checkout.**.com”, thinking they would be isolated(from a bunch of potentially insecure CORS stuff on the main page?), in a post-Spectre world they wouldn’t be isolated at all, since running in the same process would defeat any language-enforced cross-origin check.

Note: there’s a fix already available to the above problem: if you use window.open or target=_blank in a link, in both cases you can use noopener to prevent the new tab from being part of the same “similar-origin window agent”. Using noopener means the new tab will run in it’s own “similar-origin window agent”. How do I know? Because it says so in the spec: https://html.spec.whatwg.org/multipage/#browsing-context-names:creating-a-new-top-level-browsing-context-2

And what about message-ports then? Let’s finally get to those…

“it’s complicated”: on ports, channels, and entanglement

Message Ports provide the basic building blocks for channel messaging on the Web, providing the ability for web developers to send messages across browsing contexts and into workers, to be delivered in the form of DOM events.

Most importantly, ports themselves can be transferred from one event-loop to another, allowing the developer of one page to distribute communication channels to other pages/workers.

Ports can also be viewed as the basis of an object-capability model on the Web, where code running in one origin could provide a specific “capability” to code running on another origin. Not only that, but that “capability” could very well be allowed to be shared beyond that second origin, since a transferred port can be transferred again. So a “port” could represent the “capability” to do something, from any origin that happens to have received that port via one, or several, transfer steps.

Message ports are simple objects at heart, but they can get entangled in devilishly complicated relationships.

A message-port has:

A message-queue, which can be plugged into the event-loop of the window or worker that happens to “own” the port at a given time,

and can be:

Entangled with another port,
Used to send message to the entangled port, via a call to postMessage .
Itself be transferred as part of the transfer argument to the postMessage method of a window, a worker, and a port(but not it’s own).
Be enabled, either via a call to start or by setting an onmessage handler, at which point the messages in it’s queue will be “handled” by the event-loop where the message-port is found.
Be stopped, at which point you can’t use it anymore.

What does this have to do with Spectre and multiprocessing? Well, in a post-Spectre world, implementing message-ports means you’re implementing a means of communication across event-loops, and those event-loops will be in many cases running in separate process, so that’s going to involve “a whole lot of IPC”, and you know what, that’s actually “a whole lot of fun”.

Implementing MessagePorts in Rust

I’ve explained elsewhere how event-loops work, and why they’re an excellent idea not just to run JS, but also to structure you own Rust code.

In one sentence: an event-loop is an abstract thread of execution running tasks sequentially, and event-loops can communicate with each others by enqueuing tasks on each other’s queues(which can be implement via, you’ve guess it, message-passing).

Now you also know how browsers have a bunch of separate event-loops, both for windows and workers. And that’s just on the side “visible” to JS, implementations tend towards message-passing and event-loops for implementing stuff “not-visible” to JS.

By the way the article in the link above, https://www.chromium.org/developers/lock-and-condition-variable, might be one of the best “how to do multi-threading” out-there, even going as far as to contain a “The fallacy of thread context-switch cost” paragraph. An absolute must-read.

And now you also understand why in a post-Spectre world these will also be separated in different processes(detail: dedicated workers can just run in the same process as the window or worker that spawned them, in fact, they probably should because they would be part of the same agent-cluster, which is a nice topic for another blog post).

The basics of multiprocessing in Rust:
For working with processes: https://doc.rust-lang.org/std/process/index.html
For working with sand-boxed processes: https://github.com/servo/gaol
For inter-process communication: https://github.com/servo/ipc-channel/

So how can you implement something that does the below:

Brief interlude: in case you’re wondering where this async_test above is coming from, it’s a web-platform-test, and basically all the implementer of Web clients share a test-suite, and when you work on a feature in say, Servo, you can add some tests and they will be automatically up-streamed. Pretty cool, right?

As you can see in the test, we’re:

Enabling port1, via channel1.port1.onmessage .
Enabling port2 in the same way.
Sending a message to port1 , via channel1.port2.postMessage(0), expecting it to be received and handled in the local onmessage handler setup at 1.
Inside the onmessage handler, we send another message to port1, and we then transfer it to the iframe via TARGET.postMessage(“ports”, “*”, [channel1.port1]).
The code in the iframe, not shown here, will, once it receives the message containing the transferred port1, setup the onmessage handler on the received port, and receive the message sent to port1 at 4, and immediately ping it back to port2.
port2 is then expected to receive the message sent from the iframe at 5.

So, I’ll do my best to stick with the essentials, and unfortunately, this is still going to take a loooong time…

The Director’s cut of the below can be found at https://github.com/servo/servo/pull/23637

The various pieces of the puzzle

We’re lucky this time, because there are only two components of Servo involved:

The Constellation, and
Script.

The constellation runs in the “main process”, alongside the embedder, and script consists of one process per “agent-cluster”, containing among other things: a window event-loop, one or several (dedicated-)worker event-loops, and some additional threads(layout, hang monitoring, worklets, etc…).

The constellation can be thought of as a central component that keeps track of the various script components, and can be used as a broker between them if necessary. It keeps a reference to each running script, by way of an IPC sender. Each script process also keeps an IPC sender to the constellation.

Now, first problem: since ports can be created in, and transferred into, both windows and workers, we would ideally find a place to manage ports that is available in both scenarios. Enter: the GlobalScope. Think of it as the “master glue” of an agent, and the beauty is that both workers and windows have one.

So, the GlobalScope is where we’ll be managing ports.

Now, let’s start with that initial call to var channel1 = new MessageChannel();

But first, a little note on some Web platform awesomeness.

WebIDL, and the beauty of standardized and shared infrastructure.

To get something like MessageChannel to interface between Rust and JS, all you have to do is, first, add a file containing something like:

then, this will automatically generate a huge file containing bizarre-looking code like:

Then, the code that you actually have to write yourself will look like:

And that’s it, the JS call to var channel1 = new MessageChannel(); will call into your impl MessageChannel, awesome, right?

While the Rust bindings are probably Servo specific, note that the general mechanism, called WebIDL, is not. It’s in fact a “standardized” part of the Web platform, as described at https://heycam.github.io/webidl/

Back to Messageport

Back to our JS test case above, let’s go through what happens when var channel1 = new MessageChannel(); is called.

Note the call to incumbent.track_message_port(&port1, false); in the code snippet above, that’s a method call on GlobalScope right there, which will start to “track” a newly created message-port.

Why does the global want to track it? To orchestrate the “lifecyle” of a port:

It will route incoming message to a given port.
It will orchestrate the potential transfer of a port.
Finally, it should garbage collect a port that has become unusable.

Let’s take a look at what that method looks like:

And here is what message_port_state looks like:

So, just going over the key points for now:

If we’re not managing any ports yet, we setup the initial plumbing required for doing so, which includes setting up the required communication with the constellation by sharing a sender to our “router” for managed ports. A good example of “sharing senders, keeping receivers private”.
We send a ScriptMsg::NewMessagePort message to the constellation, letting it know about the new port.

Yes, as you’ve correctly read above, this is not the whole story, we actually run different logic based on whether the port was “transfer-received” or newly created. More on that later.

On IPC and routers

I want to go a bit further into this Router business, because it’s actually really interesting. This is using a router from servo/ipc-channel, and it’s an incredibly useful construct.

Here’s why: on an event-loop, you can’t receive IPC messages directly, the main reason is that a send on an ipc channel can block unpredictably(even though semantically the send is non-blocking), if the underlying buffer is full, and this can cause deadlock across the system.

Take this scenario: the constellation sends an ipc-message to a script process, and script itself is blocking on the constellation(Script sometimes ask the constellation for something in a “sync” way). Most of the time this is fine, when the send in the constellation is non-blocking, but if the ipc buffer is full, the constellation suddenly finds itself blocking on script, which is itself blocking on the constellation.

Deadlock.

The way to fix this is for the “receiving” of ipc message in script to happen on a dedicated per-process “router” thread, which will never block(it doesn’t send ipc messages, and it should only “route” the message to the appropriate in-process thread via a non-blocking send on a unbounded crossbeam channel).

So we setup a route, where each message will call the notify method of a MessageListener, on the router thread.

And how are those incoming ipc messages handled by the MessageListener?

Two things are done:

Enqueue a task back on the event-loop(the one we started on, when we setup the route to receive the ipc). This is internally done simply by sending a message on an unbounded crossbeam channel.
Inside the enqueued task(which will run on the event-loop when it receives the message containing the task), use the global scope to “route” the message to the relevant port. Note that this “task” will then dispatch a MessageEvent, (inside route_task_to_port ), which will trigger the onmessage handler, meaning some JS code will run. So the “event-loop” runs Rust code, that often then calls into JS.

Why is a task queued, instead of running the “message-receiving” task immediately from the router thread in response to receiving the IPC message?

Because the processing model of the event-loop relies on sequential task handling, meaning we can only manipulate DOM objects through a task running on the event-loop, one at a time.

If we were to handle the port message using the port immediately on the router thread, that would give us execution of tasks in parallel to the event-loop, hence breaking the sequential nature of it’s processing model.

The HTML standard is pretty clear about that, since it does contain a lot of “in-parallel” steps, and those must always “affect the world of the event-loop” via a queued task, not directly from the parallel algorithm.

(..)in algorithm sections that are in parallel, you must not create or manipulate objects associated to a specific JavaScript realm, global, or environment settings object. (Stated in more familiar terms, you must not directly access main-thread artifacts from a background thread.) Doing so would create data races observable to JavaScript code, since after all, your algorithm steps are running in parallel to the JavaScript code (Source: Dealing with the event loop from other specifications).

Secondly, we’d risk blocking the router thread for a long time, or even deadlock, in the light of the possibly blocking ipc send (a JS task could end-up sending a ipc message to the constellation, sometimes even blocking to wait on the response, so if the constellation it then itself blocked on a send, you get a deadlock).

Also interestingly, we do pass a reference to the GlobalScope around to the router thread, and then back on the event-loop. That’s using Servo’s “magic” Trusted wrapper, which allows us to send a reference to the GlobalScope around threads, but only so that we can then enqueue a task back on the event-loop which will use that GlobalScope. We can’t actually use the reference to the scope on any thread but the event-loop thread where it “originated” from, but we can pass it around to the router thread just for the purpose of including it in a task enqueued on the event-loop from the router thread.

When the enqueued task executes on the event-loop, and the GlobalScope is used, it will be as if it had never been shared with the router thread.

About that postMessage…

Let’s move on to the call to TARGET.postMessage(“ports”, “*”, [channel1.port1]); where TARGET is a cross-site iframe.

So, note that ports, but also windows, and even workers, have a postMessage method. So, in the case of a cross-site call to postMessage , say, via a cross-site iframe like in the test above, an IPC message will have to be sent to the constellation, which will then “forward” it to the relevant process, where a task on the relevant event-loop will be enqueued in order to “receive” the message.

And, while iframes and workers cannot be transferred themselves, you can transfer port into them as part of the postMessage call.

How does that work? Let’s look at the cross-site iframe case.

The above is the Rust equivalent to the JS postMessage , with different variants based on whether an “options” object is passed as argument or not.

As you can see, the transfer argument is received as a CustomAutoRooterGuard<Option<Vec<*mut JSObject>>>, and also note the message: HandleValue, both are examples of the “magic” of the JS bindings in servo.

And as you can also see, we then pass it to the following call: let data = structuredclone::write(cx, message, transfer)?;.

This calls into an API provided by SpiderMonkey, resulting in the JS message being “cloned”, essentially turned into a value independent of the current JS execution context. Regarding the transferring of the ports, well, we’re going to have to do it ourselves, via a callback that the cloning API calls into, because it doesn’t know how to handle message ports directly.

That callback then calls into the Transferable trait, that is implemented for MessagePort in the following way:

Essentially the clone API makes available some raw pointers, including one *mut u64, via a call to a method of the port itself, which we use to write the unique Id of the message-port. The “clone” API then spits out essentially a Vec<u8> containing both our automatically cloned JS message, and the “manually cloned” Id of the message port. That Vec<u8> can then be sent on a message over IPC.

Wondering how Servo can have a u64 represent something unique across threads and even processes? Checkout the PipelineNamespace concept.

At that point, importantly, the message port that is being transferred will notify both the GlobalScope and the constellation that it has been shipped out of an event-loop.

Finally, the call to post_message of the iframe sends this Vec<u8> along with other info, to the constellation.

(A DissimilarOriginWindow is basically a cross-process proxy to the actual Window running in another process).

Upon receiving this Script::PostMessage message, the constellation will forward it to the process where the iframe is actually running. There, a task will be enqueued, to “handle” the message on the event-loop.

How does receiving that message look?(note: the below is called post_message as well, and it’s the method of the actual window inside the process where the document of the iframe is running, whereas previously we called the post_message method of essentially the iframe in the other process…)

The main thing to note is that we receive a StructuredCloneData , which is basically the Vec<u8> we wrote in the other process.

Then, we queue a task on the event-loop to actually “decode” the message, which, if it contains any tansferred ports, calls into the transfer_receive method of MessagePort for each tranfserred port, and dispatch a MessageEvent containing the results to the JS.

On the JS side of things, it will look something like:

Ok so that was messaging with an iframe, what about a port?

How does “messaging” using a port look like? Well, it’s actually very similar to the above. The only difference is that we check whether the port has been transferred or not. If not, it just does the messaging via a task enqueued on the local event-loop. If it has been “shipped”, it sends an IPC message via the constellation, and upon receipt in the other process a task will be enqueued to dispatch a MessageEvent, via the call to GlobalScope.route_task_to_port that you saw earlier when we discussed initially creating message-ports.

Below is the post_message implementation of MessagePort, which will look eerily similar to what was done above with the window.

The above then calls into a corresponding method on the GlobalScope, which looks like the below:

What is different from the window steps above, is that we’re not immediately sending the message, instead we queue a task on the local event-loop to do it later. The reason for this is that the entangled port to which the message is being sent to, might still be transferred in the current task, the same task where postMessage is called.

If the entangled port is transferred-out of the current global, by the time the task dispaptching the MessageEvent would run, the entangled port would be gone from the current global and the message lost.

By only running the rest of the “postmessage steps” in a subsequent task, we can be sure that the “state” of the entangled port is stable, and we will know if it has been transferred out or not and can act accordingly.

“Routing a task to a port” from the GlobalScope looks like:

As you can see, several things can happen:

The GlobalScope is not managing this port, or any other port, and the message is re-routed via the constellation.
The message is buffered by the port, because it hasn’t been enabled yet.
The message is dispatched via a MessageEvent
An error occurs during the deserialization of the message, and a error event is dispatched.

The dispatching of MessageEvent is essentially the mirror of what was done earlier using the Window.

What happens when the task is re-routed by the constellation? Essentially, an IPC message will be sent to the event-loop where the entangled port is found(potentially after some buffering by the constellation, if the port is still in-transfer), and that IPC message will be received and handled by the “router” that we saw being setup as part of track_message_port.

It will look like:

And as you can see, this simply calls again into route_task_to_port of the global that is actually managing the port.

The end of this article, and the beginning of your contributions to Servo?

This was probably the longest article I wrote so far. As you might have noticed, the whole ‘message-port’ business was really just an excuse to rattle-on about Spectre.

However, if you think this “cross-event-loop” and “cross-process” type of coordination work looks interesting, you could take a shot at it yourself.

How? By implementing BroadcastChannel .

Unlike a MessagePort, a BroadcastChannel cannot be transferred between event-loops, hence they could re-use the patterns used by MessagePort, but without the transfer business.

So it should actually be easier, and that’s not why you should do it.

So why then should you do it?

But why, some say, Servo? Why choose this as our goal? And they may well ask, why climb the highest mountain?
Why, 92 years ago, fly the Atlantic? Why go to the moon in ‘69? Why, in 2019, rewrite layout?
We choose to contribute to Servo in this decade, and do the other things, not because they are easy, but because they are hard.

And finally, “hard” is in the eye of the beholder. One of the issues below is actually marked “less-easy”, yet it will probably be “hard” for a beginner. And the only, and actually very simple, way to get better is by doing things that are progressively harder.

Implement BroadcastChannel · Issue #21025 · servo/servo

github.com

WebMessaging: fully support MessageEventSource · Issue #24454 · servo/servo