Programming Servo: Anatomy of a Fetch
Today, let’s go through an entire fetch in Servo, starting with this example in JS:
Servo implements “Fetch” as specified in the living standard at https://fetch.spec.whatwg.org/.
I’m not going to cover the integration between Servo and the JS engine, so you’re going to have to believe me that the call to fetch in the above JS code will result in the below Rust code being called.
This function returns a promise, but first presumably somehow “starts a fetch”. How?
Looks like we’ve got:
- a “fetch_context”,
- a “listener”, and
- an ipc channel and router,
- a “core_resource_thread”, to which a message is sent corresponding to the ipc route…
- Finally, the promise is returned.
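The shape described by this list can be sketched without any Servo types. The following is only a minimal illustration, assuming std::sync::mpsc channels in place of ipc-channel and its router; CoreMsg, ResponseMsg, PendingFetch, start_fetch and spawn_resource_thread are hypothetical names, not Servo’s API.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical stand-ins; none of these are Servo's real types.
#[derive(Debug, PartialEq)]
enum ResponseMsg {
    Chunk(Vec<u8>),
    Done,
}

// The message sent to the "core resource thread": a URL plus the sender
// on which the response messages should come back.
enum CoreMsg {
    Fetch(String, Sender<ResponseMsg>),
}

// Stand-in for the promise handed back to the caller.
struct PendingFetch {
    responses: Receiver<ResponseMsg>,
}

// Start a fetch: create a channel, send one end to the resource thread,
// and return immediately with the other end.
fn start_fetch(core_resource_thread: &Sender<CoreMsg>, url: &str) -> PendingFetch {
    let (action_sender, action_receiver) = channel();
    core_resource_thread
        .send(CoreMsg::Fetch(url.to_owned(), action_sender))
        .expect("resource thread has shut down");
    PendingFetch { responses: action_receiver }
}

// A toy resource thread that answers every fetch with one chunk
// (the URL bytes, purely for demonstration) followed by Done.
fn spawn_resource_thread() -> Sender<CoreMsg> {
    let (sender, receiver) = channel();
    thread::spawn(move || {
        for CoreMsg::Fetch(url, reply) in receiver {
            let _ = reply.send(ResponseMsg::Chunk(url.into_bytes()));
            let _ = reply.send(ResponseMsg::Done);
        }
    });
    sender
}
```

The point of the sketch is that start_fetch returns right away: the caller only holds the receiving end, and everything else happens on the other side of the channel.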
The “core resource thread” is something the “global” gives us:
So we can tell that a CoreResourceThread is basically an ipc-sender…
But a sender to what? For that we need to look at another file:
Ok, so it looks like it’s a sender to this channel_manager, and let’s skip how the globalscope actually gets that sender in the first place, assuming it is done somewhere in the initialization of a “script-thread”.
What happens after this channel_manager “starts”?
Looks like it is indeed receiving messages, and handling them inside self.process_msg. Ok, let’s take a look there then:
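As a rough stand-in for that receive-and-dispatch loop, here is a minimal sketch. It assumes a std::sync::mpsc channel instead of an ipc one, and the CoreMsg variants, ChannelManager fields and start_channel_manager are hypothetical; only the process_msg name comes from the code discussed here.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical message type with just two variants.
enum CoreMsg {
    Fetch(String),
    Exit,
}

struct ChannelManager {
    // URLs seen so far; a real manager would hand them to a resource manager.
    handled: Vec<String>,
}

impl ChannelManager {
    // Handle one message. Returns false when the loop should stop.
    fn process_msg(&mut self, msg: CoreMsg) -> bool {
        match msg {
            CoreMsg::Fetch(url) => {
                self.handled.push(url);
                true
            }
            CoreMsg::Exit => false,
        }
    }
}

// Spawn the manager on its own thread. The returned sender plays the role
// of the "core resource thread" handle; the join handle yields the URLs seen.
fn start_channel_manager() -> (Sender<CoreMsg>, JoinHandle<Vec<String>>) {
    let (sender, receiver) = channel();
    let handle = thread::spawn(move || {
        let mut manager = ChannelManager { handled: Vec::new() };
        // Block on the channel until told to exit or all senders are gone.
        while let Ok(msg) = receiver.recv() {
            if !manager.process_msg(msg) {
                break;
            }
        }
        manager.handled
    });
    (sender, handle)
}
```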
Alright, we seem to be getting pretty close to the actual fetch because I can see a call to self.resource_manager.fetch. What happens there?
Wow, another “context”… In the meantime, we can see that we’re essentially building a thread, and calling fetch::methods::fetch inside of it.
Taking a step back
Is there a pattern emerging? If anything, in Servo, in order to follow ‘how stuff is called’, be prepared to look far and wide, and don’t give up trying.
On the other hand, nothing can really remain a mystery for long, because it’s all literally there somewhere in a piece of code…
Moving on to our fetch
So here follows basically a skeleton fetch, only showing the basic flow.
Main takeaways:
- “target” implements FetchTaskTarget.
- We have a done_chan, on which Data can be sent.
- We call various process_ methods of ‘target’.
- I’m not going to include it here, but http_fetch will do the fetch in a separate thread, and send Data on the done_chan as it comes in over the network.
- In wait_for_response, we wait on done_chan to tell us the response is done, or aborted.
In the below code, I’ve left the step numbering, just to show how there is so much more going on than is shown here.
You could take a look at the actual spec, or the original Servo code, for the full frontal.
So what is this FetchTaskTarget trait?
Let’s follow the target.process_response_chunk call above. While the above code runs in the ‘main’ fetch thread, and the messages received on done_chan are coming from a request-dedicated ‘fetch worker’ thread, there is also a third thread involved in this dance: the “script thread”, where the event-loop of the webpage (or a worker) where this request was initiated is running.
In fact, a lot of those “threads” are actually separate processes; I am not being particularly precise on this point. An ipc-channel is also used in the code. So when you read “thread”, you might want to swap it mentally with “process”, or perhaps just settle for “concurrent component” (or is it parallel?).
So what does this target.process_response_chunk do?
As you can read, that call will end up sending a message to another thread, containing a chunk of data. That ‘other thread’ is not yet the script-thread; it will be the router thread of the ipc-sender (I think, actually not sure there).
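The idea that the target is nothing more than a sender can be shown in a few lines. This is a minimal sketch, assuming a std::sync::mpsc Sender in place of an ipc sender; the two-variant FetchResponseMsg, the two-method trait and forward_two_chunks are simplified, hypothetical versions written for illustration.

```rust
use std::sync::mpsc::{channel, Sender};

#[derive(Debug, PartialEq)]
enum FetchResponseMsg {
    ProcessResponseChunk(Vec<u8>),
    ProcessResponseEOF,
}

trait FetchTaskTarget {
    fn process_response_chunk(&mut self, chunk: Vec<u8>);
    fn process_response_eof(&mut self);
}

// The sender itself is the target: each call just forwards a message.
impl FetchTaskTarget for Sender<FetchResponseMsg> {
    fn process_response_chunk(&mut self, chunk: Vec<u8>) {
        let _ = self.send(FetchResponseMsg::ProcessResponseChunk(chunk));
    }
    fn process_response_eof(&mut self) {
        let _ = self.send(FetchResponseMsg::ProcessResponseEOF);
    }
}

// Drive the target and collect what arrives on the other end.
fn forward_two_chunks() -> Vec<FetchResponseMsg> {
    let (mut target, receiver) = channel::<FetchResponseMsg>();
    target.process_response_chunk(b"hel".to_vec());
    target.process_response_chunk(b"lo".to_vec());
    target.process_response_eof();
    drop(target); // close the channel so the iterator below terminates
    receiver.iter().collect()
}
```

The fetch code only ever sees the trait, so it does not need to know that its calls are being turned into messages for someone else.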
So how is this new chunk of data going to be handled by the script thread? That will be done through the so-called “networking task source”, which in Servo is basically implemented as a channel.
However, what piece of code is going to call this queue_with_canceller?
First of all, it turns out there is another trait involved:
Ok, so we can sort of tell that a method called process_response_chunk will end up being called on a “listener” of sorts, by way of yet another Action trait that will call the process method of this message.
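The relationship between the message, the Action trait and the listener can be sketched as follows. It is a simplified illustration: the trait and method names follow the post, but the u16 status payload, the reduced set of variants and RecordingListener are assumptions made to keep the example short.

```rust
enum FetchResponseMsg {
    ProcessResponse(u16), // hypothetical payload: just a status code
    ProcessResponseChunk(Vec<u8>),
    ProcessResponseEOF,
}

trait FetchResponseListener {
    fn process_response(&mut self, status: u16);
    fn process_response_chunk(&mut self, chunk: Vec<u8>);
    fn process_response_eof(&mut self);
}

// "Something that can be applied to a listener."
trait Action<Listener> {
    fn process(self, listener: &mut Listener);
}

// A message applies itself by calling the matching listener method.
impl<T: FetchResponseListener> Action<T> for FetchResponseMsg {
    fn process(self, listener: &mut T) {
        match self {
            FetchResponseMsg::ProcessResponse(status) => listener.process_response(status),
            FetchResponseMsg::ProcessResponseChunk(chunk) => {
                listener.process_response_chunk(chunk)
            }
            FetchResponseMsg::ProcessResponseEOF => listener.process_response_eof(),
        }
    }
}

// A listener that records what it receives.
#[derive(Default)]
struct RecordingListener {
    status: Option<u16>,
    body: Vec<u8>,
    finished: bool,
}

impl FetchResponseListener for RecordingListener {
    fn process_response(&mut self, status: u16) {
        self.status = Some(status);
    }
    fn process_response_chunk(&mut self, chunk: Vec<u8>) {
        self.body.extend(chunk);
    }
    fn process_response_eof(&mut self) {
        self.finished = true;
    }
}
```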
Let’s go back to the code that actually started this fetch in the first place:
Aha! We use an ipc-channel router, so that when we receive a message, the notify_fetch method of the listener will be called, with the msg as argument.
What does this “network listener” look like, you might wonder?
Aha (yes, again)! We now see that the message, implementing Action, ends up inside a ListenerTask, and is queued on the networking task source.
This is actually extremely interesting, because it shows how something, here a “fetch”, happening in one thread, will affect the world of another thread, the script-thread where your favorite web page is running, by having some task queued on its event-loop! Eureka, there is no shared state, only inter-thread messaging, with the handlers of such messages queuing “tasks” on the event-loop of the receiving thread.
So how does such a ‘task’ get executed on the target event-loop? Well, there is yet another trait for that, called TaskOnce. Let’s skip the trait definition, and look at how ListenerTask implements it:
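To make the mechanism concrete, here is a minimal sketch of a listener that wraps each incoming message in a task and queues it on an event-loop. It assumes a plain std::sync::mpsc channel as the “task source” and an Arc<Mutex<…>> around the listener. TaskOnce, ListenerTask, Action and NetworkListener are named after the post, but their definitions here, along with Chunk and run_queued_tasks, are hypothetical.

```rust
use std::sync::mpsc::{Receiver, Sender};
use std::sync::{Arc, Mutex};

// A unit of work the event-loop can run exactly once.
trait TaskOnce: Send {
    fn run_once(self: Box<Self>);
}

trait Action<Listener> {
    fn process(self, listener: &mut Listener);
}

// Pairs a message with the listener it should be applied to.
struct ListenerTask<A, L> {
    context: Arc<Mutex<L>>,
    action: A,
}

impl<A, L> TaskOnce for ListenerTask<A, L>
where
    A: Action<L> + Send,
    L: Send,
{
    fn run_once(self: Box<Self>) {
        let this = *self;
        let mut context = this.context.lock().unwrap();
        this.action.process(&mut *context);
    }
}

// Runs on the receiving side of the router: does no work itself,
// it only queues a task for the event-loop.
struct NetworkListener<L> {
    context: Arc<Mutex<L>>,
    task_source: Sender<Box<dyn TaskOnce>>,
}

impl<L: Send + 'static> NetworkListener<L> {
    fn notify<A: Action<L> + Send + 'static>(&self, action: A) {
        let task = ListenerTask { context: self.context.clone(), action };
        let _ = self.task_source.send(Box::new(task));
    }
}

// Event-loop side: run everything currently queued, return how many ran.
fn run_queued_tasks(queue: &Receiver<Box<dyn TaskOnce>>) -> usize {
    let mut ran = 0;
    while let Ok(task) = queue.try_recv() {
        task.run_once();
        ran += 1;
    }
    ran
}

// A toy action: append some bytes to a Vec<u8> "listener".
struct Chunk(Vec<u8>);

impl Action<Vec<u8>> for Chunk {
    fn process(self, listener: &mut Vec<u8>) {
        listener.extend(self.0);
    }
}
```

Note that calling notify changes nothing on the listener’s state; the bytes only show up once the event-loop gets around to running the queued tasks.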
Wow, I think I actually see a call to a process method of an action right there! Now go back to one of the gists a little bit above: what happens when the process method of a FetchResponseMsg is called? It simply matches on self, and depending on the kind of message it is, calls the appropriate method of a “listener”, implementing FetchResponseListener, passed to it. In this case it’s our context that is passed to it, and what is our context? It’s a FetchContext, which implements a process_response_chunk method, yep, there it is:
There we are, finally: a message was sent from the fetch thread; when it was handled by the ipc router, it resulted in a task being queued on the “networking task source” of the webpage event-loop; and when that task is finally run, it calls a method on the “fetch context”. Note that the promise is resolved in process_response, which is ‘fired’ in response to the headers being received, and after that the actual data is written to the body as each chunk comes in, reflecting the fact that the body of a response in JS is a ReadableStream.
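The ordering described here, promise first and body afterwards, can be illustrated with a toy context. This is a sketch under loud assumptions: the “promise” is modelled as a one-shot channel and the body stream as another channel, which is not how Servo represents either. Only FetchContext and the process_ method names come from the post; Response, the fields and read_to_end are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// What the "promise" resolves with: headers now, body as a stream.
struct Response {
    status: u16,
    body: Receiver<Vec<u8>>,
}

struct FetchContext {
    // Hypothetical promise resolver; taken once the headers arrive.
    resolve: Option<Sender<Response>>,
    // Writing end of the body stream; None before headers and after EOF.
    body_writer: Option<Sender<Vec<u8>>>,
}

impl FetchContext {
    // Headers received: resolve the promise before any body data exists.
    fn process_response(&mut self, status: u16) {
        let (writer, body) = channel();
        self.body_writer = Some(writer);
        if let Some(resolve) = self.resolve.take() {
            let _ = resolve.send(Response { status, body });
        }
    }

    // Each chunk is pushed into the already-exposed body stream.
    fn process_response_chunk(&mut self, chunk: Vec<u8>) {
        if let Some(writer) = &self.body_writer {
            let _ = writer.send(chunk);
        }
    }

    // Dropping the writer closes the stream for the reader.
    fn process_response_eof(&mut self) {
        self.body_writer = None;
    }
}

// Consumer side: drain the stream until it is closed.
fn read_to_end(response: Response) -> Vec<u8> {
    response.body.iter().flatten().collect()
}
```

A consumer holding the Response can start reading as soon as the headers are in, and read_to_end only returns once process_response_eof has closed the stream.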
So coming back to our initial example, how is the response turned into a Blob?
Like that:
And I think that’s enough for today…