Continuations, coroutines, fibers, effects
Theory and practice in React 16
That said, let me underline that when it comes to just using React, you do not need to know any of this abstruse theory. Getting to know React under-the-hood is a fascinating undertaking, but it’s a pleasure, not business 😉.
This post builds on a talk delivered at the React Advanced Meetup (@ReactAdvanced) at Facebook London. It covers the material in more detail and attempts to answer some of the questions raised. This post aims to be fairly comprehensive and is therefore long. Here is a table of contents to help you navigate:
- What is React Fiber?
Part 1: Theory
→ Delimited or undelimited?
→ Stackful or stackless? One-shot or multi-shot?
3. Effect handlers
4. Coroutines and fibers
Part 2: Practice
5. Continuations revisited
→ Continuation delimiters revisited
6. Effect handlers revisited
7. Coroutines revisited
Sitting comfortably? Then I shall begin…
…And do not forget, if you like this post, clap 👏 , share, and follow me on Twitter 🐦! It means a lot!
1. What is React Fiber?
Briefly, React Fiber is a rewrite of React’s core reconciliation algorithm to provide greater control over the render cycle. Fiber’s exciting features depend mostly on a new superpower: the ability to pause and resume in the middle of a render cycle. Some of those features are:
- Time slicing: a single render cycle is broken into small chunks which can be processed at different times. A high priority render can interrupt a low priority render, which can be resumed when the high-priority render is complete.
- Suspense: a component can make an asynchronous request for data, suspend any further processing of itself until that request is complete, and then resume with the data.
- Error boundaries: errors can be caught anywhere in the component tree and rendering can be resumed from the point at which the error was caught.
If any of this is new to you, go watch Lin Clark’s A Cartoon Intro to React Fiber, and follow it up with Dan Abramov’s Beyond React 16. For the implementation background, check out Andrew Clark’s react-fiber-architecture repository.
The key takeaway from the react-fiber-architecture repository is the following:
Fiber is a reimplementation of the stack, specialized for React components. You can think of a single fiber as a virtual stack frame.
The advantage of reimplementing the stack is that you can keep stack frames in memory and execute them however (and whenever) you want.
Of course that phrase “in a manner of speaking” hides a lot of ambiguity.
So let us see what we can do about that.
One-shot Delimited Continuations with Effect Handlers
Has anyone previously proposed a Algebraic Effects (e.g. Eff like handlers of continuations) for ECMAScript? #lazyweb I…
(Side-note: this is not a picture of Sebastian Markbåge; I don’t know who this is 🤷.)
Continuations are the most basic of the various language features we are considering, and the basis of the others, so let us begin there.
A continuation is a control flow primitive, which means it is in the same category of things as while loops, if/else clauses, and functions: it is something we can use to avoid having to write code in the order we want it to be executed.
Specifically, a continuation is commonly defined as an abstraction to represent “the rest of the computation”…whatever that means.
This standard definition confuses me. Do we not already have an abstraction to represent “the rest of the computation”? It is called the stack.
The stack is a list of jobs. When the stack is empty, it means the program has terminated. Each job is represented on the stack by a “stack frame”, which enables the data required for a particular function call to be efficiently located and deallocated when no longer needed.
The defining feature of the stack is that stack frames are ordered by their physical location in memory. Each frame is placed at a memory address following the previous frame. This sequential storage is what makes the stack such an efficient way to queue and process jobs, since the data needed for the current computation can be located simply by keeping track of the end of the stack. Sequential ordering has an important limitation, however: we cannot change the order of execution of jobs (frames) once the stack has been built up. What goes up must come down, and in exactly the same order.
Now, let us imagine for a moment that instead of storing frames sequentially in memory we stored them at random locations on the heap. We would need some other way of defining the order in which frames should be processed: we would need to include a pointer in each frame to point to the location of the next. In other words, we would store our queue of jobs as a linked list. In order to return from a subroutine, our runtime would use these pointers to jump to the next stack frame.
Those pointers that point to the frame to continue with when the current one is done… let us call them continuation pointers.
What is the major advantage of linked lists? Insertion and reordering of nodes is easy.
So, let us suppose that our language has the ability to “reify” the process of returning from a particular subroutine and give it to us as a first class object, a version of
return that we can store, pass around, and eventually call as if it were a function. This ability would be quite pointless if the behaviour of
return were constrained by the order of the stack, since in that case the only possible return would be to the calling function, just as with the keyword
return worked with continuation pointers, however, we could pass it around at will, and invoke it from anywhere to jump to the call frame indicated by the pointer. We could use it to skip forwards in the call chain in order to skip queued tasks, or to skip backwards in order to rerun tasks that had already completed.
If we had such a magical
return function, it would be called a continuation.
At the language level — for example, in Scheme and some versions of Ruby — continuations are usually accessed through a function called
callcc (“call with current continuation”). This is what
callcc reifies the current continuation at the point it is called, and binds it to
cont to be handled inside a callback. From there, we can use the reified continuation however we like. This example would log
hello, call the continuation, which would restart execution at line 3, log
hello again, and so on, in an infinite loop.
callcc gives us the ability to pass data into the next call frame. Any data passed in is then accessed inside the continuation as the result of the call to
callcc, like this:
In this case, we will first see
undefined logged, and then
hello in an infinite loop.
You can begin to see now how continuations might be used within React Fiber to implement something like time slicing. React has a list of jobs that it needs to do, which are waiting on a queue. It pulls a job off the queue and begins processing it within an interval set by the scheduler. When the deadline expires, it can push the last saved continuation back onto the queue and resume it in the next interval.
Delimited or undelimited?
The continuations captured by
callcc are undelimited continuations. A continuation that captures the whole call graph up to the point it is reified is undelimited. It will continue to run until the program exits. Note that undelimited continuations are rather different from functions in this regard, since a function will always return before the program exits and so produce a result that can be handled within the program. Undelimited continuations never return.
Using undelimited continuations for control flow is a bit like using callbacks in Node.js. If we want to handle the result of a continuation, we can only do so inside that continuation, so we need to pass a handler into the continuation when we invoke it. The result resembles the famous pyramid of hell:
Another way of stating the problem: undelimited continuations do not compose.
Suppose instead that our continuations did not capture the whole call chain, but only a certain section up to some delimiter. In other words, suppose our linked list of jobs had an entry/exit point which, instead of terminating the program, returned a computed value to another sequence of jobs; since those jobs are beyond the delimiter, they do not need to handle continuations and might as well be on the stack.
We need some operators to both mark a delimitation and reify a continuation. The most common delimited continuation operators — you can find them in Racket and some versions of Scala — are
reset marks the delimitation. Then
shift does three things:
- It clears the current continuation up to the enclosing
reset. In this example, that continuation is
- It binds that cleared continuation to
- It executes the body of the function passed to
Confusingly, the result of the call to
reset is whatever was returned by the callback passed to
shift. Inside the
shift callback body we can use the reified continuation as if it were a normal function, where the return value of the function is the result of running the continuation. Here we compose its output multiple times to octuple a number.
Heaven knows that
shift can be quite mind-bending the first time you see it. The key to following it is to first ignore everything else in the
reset callback (i.e. the continuation) and treat the function passed to
shift as the ‘real’ content of the
reset callback. After all, whatever this function returns is what will finally be returned from
reset; and if the
shift callback does not call
cont, that continuation will never in fact be executed.
shift is aptly named in the sense that it reifies the body of the
reset callback, removes it, and shifts it into the body of the
shift callback in the form of a function.
With composable/delimited continuations we can do some interesting things. For example, we can refactor callback-style control flow into something resembling async/await or generators. In the example below, we pass the reified continuation directly into an asynchronous function as its callback. Note that the
shifts appear to be successive, but are in practice nested, since the second is cleared and passed to the first inside its continuation.
The resemblance between this code and async/await or generators is no mere chimera, as we shall see in the next section.
Stackful or Stackless? One-shot or multi-shot?
It turns out that
yield is in another, albeit rather more restricted, delimited continuation operator.
The best way to appreciate the proximity of
shift is to examine a generator implemented using
reset. The API of the generator will be as follows. The code below logs
The implementation looks like this:
Let us break this down. It is easiest to read from the bottom up:
- Line 16: we want our generator to return when we call
next, so we wrap
- Line 12: the first time
nextis called, we simply execute the function passed into the generator constructor, and pass in
- Line 5: When we call
yield, we call
shift, save the continuation at that point in
resumeGenerator, and return whatever was passed to
shiftreturns, so whatever was passed to
yieldis also passed back from
- Line 16: the next time we call
next, we call the continuation that was saved in the previous step in order to resume execution of the generator at that point.
So you can see that
yield is a rather thin wrapper over
shift, just adding a few restrictions:
yieldreturns immediately after capturing the continuation.
- The only way to activate the continuation is to call
- Once you have called
next, you cannot backtrack; the continuation is overwritten.
That last restriction is what is meant by one-shot in Sebastian Markbåge’s One-shot Delimited Continuations with Effect Handlers proposal. A one-shot continuation can be called at most once. Generally, one-shot continuations are easier on the runtime, since the continuation can be overwritten in place.
Compare the simplicity of the implementation of generators using
reset with the implementation of generators in a language without support for delimited continuations, ES5. The generator body has to be divided up by the transpiler into
case blocks at every
yield; flow is then controlled by incrementing some state (
_context.next) to determine which block to execute. Ugh.
yield — analogously to
return for ordinary functions — can only be called inside the top level of the generator body. By contrast,
yield implemented with
shift, can be invoked at any depth of call to capture the entire call chain back up to
shift saves stackful or multi-frame continuations.
3. Effect Handlers
Finally, it is time to take a look at the “effect handlers” part of that proposal. Although they sound complicated, “algebraic effects” are just a small refinement of delimited continuations. First, note that
shift behaves a bit like
throw in that both clear the current delimited continuation (with
throw the delimiter is
try). There are two major ways in which
shift differs from
throw. Namely, with
- The continuation is forcibly discarded instead of reified and made available.
- Control is passed to a handler defined outside of the delimiter instead of to a function defined inline.
Suppose that we had instead decided errors should be handled like this:
This pattern seems a bit obtuse, and yet it is more or less what happens with
shift. This is, I think, one reason
shift at first appears so strange. Since the current continuation is cleared, it makes sense to jump over it and pass control to a handler that is defined outside it. The ‘shift’ in control caused by clearing the current continuation is now reflected by a physical displacement of code:
shift behaved in a similar way? What if, on calling
shift, control jumped to a handler, and the continuation was dealt with there, instead of in-line?
Since with this syntax the handler is no longer in scope, we also need the ability to pass extra data into it. So let us add that, and let us rename
effect. Let us also pass the effect into the function to signify that handlers are bound to their effects:
The result is the language feature called “algebraic effects”. “Algebraic” just means that some laws apply about how these effects compose. The important part is the “effects”. What is an effect? It is nothing other than a custom delimited continuation operator. Instead of calling the generic
shift we call an effect which behaves just like
shift, but embeds some particular user-defined way of handling the continuation. Since we can define handlers, we can encapsulate particular ways of handling delimited continuations and make them easily reusable. For example, we can easily implement async/await using algebraic effects:
In this case, all our handler has to do is pass the continuation to the function thrown into the handler as its callback. Of course we could embed much more specific behaviour if we wished:
Here we have bound the handlers to particular functions. This degree of encapsulation is probably too fine, but illustrates the idea. Andrej Bauer, the creator of a language for algebraic effects, Eff, puts it concisely:
eff : shift/reset = while/if-then-else : goto.
4. Coroutines and fibers
Coroutine is another term that pops up frequently in the context of React Fiber.
So what is a coroutine? There are several answers to this question, depending on the context, but generally speaking a coroutine is a generator with a bit of extra functionality. The following might be referred to as coroutines:
.nextand so by this definition are coroutines.
Bluebird.coroutine,which were async/await implementations based on generators.
- A generator that can yield with a stackful continuation, like
shift. With React Suspense we can pause reconciliation at any depth. So the reference to coroutines in Andrew Clark’s tweet would seem to refer to this idea of being able to resume a deeply nested call frame.
If Suspense is based on coroutines (under the final definition), why is React 16 called React Fiber and not React Coroutine? “Fiber” is frequently used as a synonym of “coroutine”. That may be what Andrew is doing in his tweet, and in most respects the last definition of coroutine given above covers what is meant by a fiber in React. But there is a relevant difference between the two, nicely described by this document. When a coroutine yields, control is passed to the caller and handled by application code. When a fiber yields, control is passed to a scheduler which determines what to run next. Fibers are therefore controlled at the level of the operating system or framework.
A basic example of such a scheduler is the Node.js event loop, which controls the resumption of async/await functions simply by executing whatever finished first. In fact, a Node extension
node-fibers exists which leverages this scheduling to provide stackful yielding with asynchronous value resolution (i.e. deep
await). React’s scheduler is more complicated, dealing with various priority classes for different fibers. Since coroutines also exist as a distinct concept within the React Fiber codebase (see “Coroutines and fibers revisited” below), it seems likely that the distinction between coroutines and fibers is pertinent to React Fiber, and that the importance of the scheduler lies behind the choice of the name “Fiber”.
⚠️ DANGER: SPECULATION AHEAD!
Well done for making it this far. Now it’s time for dessert 🍰🍓: a closer look at how these different concepts manifest themselves in React. A warning: the React source code is not easy to read, is constantly changing, and my thoughts are speculative. Please do get in touch if you can provide any clarifications or corrections.
5. Continuations revisited
As Andrew Clark explains in the react-fiber-architecture repository, a fiber corresponds to a React node and to a virtual stack frame on the heap. A fiber has some input (
pendingProps), some state (a component instance attached to
stateNode), and some output (
effectTag, a side effect to pass to the renderer; the react-fiber-architecture repo is out of date when it refers to an
output field). Each fiber also has what we earlier called a “continuation pointer” under a
return field: it points to the fiber that spawned this fiber.
How exactly are such fibers “resumable”?
One thing we know: no use is made of the native continuations provided by generators. It is therefore impossible to save a continuation and resume at arbitrary points during a render function. With good reason, Fiber refers to components as ‘units of work’ — because in one sense they cannot be split.
So in what sense can they be split?
The key to the puzzle is that because an update on a single fiber typically involves updating children of that fiber as well, the concept of what is actually a “running fiber” is double.
- The fiber on which the update began is a single fiber even though running it involves spawning child fibers all the way down the tree. All those child fibers are part of the single “running fiber” that is being updated.
- Every component in the tree also has an associated fiber, so we can think of the “running fiber” as the specific component being processed at any given instant.
In perspective (2), the “running fiber” corresponds to a unit of work which may run entirely synchronously. But if we adopt perspective (1) and consider the entire recursive reconciliation as the “running fibre”, this fiber can be paused at every call to child fibers, since those are also separate fibers that can be scheduled without being run. The main update then resumes with the child fiber, which either spawns more child fibers or uses the
return pointer to move control back to the parent.
In theory, then, we can pause and resume at any particular fiber. In practice, if a low priority update has been interrupted, some work that was already done may now be invalid: and that means we cannot merrily resume from the point we left off. If the root node has been affected, ‘resumption’ may mean rerunning the entire update from the beginning. React’s internal memoization is therefore an essential aspect of what it means to resume at a certain point. If each step of the work was memoized before the interruption, we can walk fibers without actually running them until either we get back to the original yield point or we find work that needs redoing. In such cases, continuation is quasi-continuation enabled by memoization.
The need to have aborted work be resumable at any point down the tree explains why ES6 generators were not a practical solution to the problem of saving continuations. Apart from the syntax overhead that comes with the shallow yield restriction we discussed above, running the component tree inside a generator would block memoizing each fiber separately, as Sebastian Markbåge explains here. Consequently, a change anywhere in the tree would require rerunning the whole generator. By controlling itself how each frame is called, React can inject the memoization it needs.
That, in broad strokes, is what “continuing” actually means in React Fiber, but what about the continuation itself? Can we pin it down further to a particular entity or abstraction?
Not really 😂.
It is possible to discern at least a couple of different ways in which the notion of a continuation is used in relation to these fibers.
One of the first places continuations were publicly highlighted as important to React was Sebastian Markbåge’s 2016 react-basic repository, an attempt to distil React into a few essential concepts which is anything but basic.
In this example we have a component,
FancyUserList, which renders a couple of nested children, a
UserList inside a
FancyBox. Instead of being called, however, the
UserList component only has some arguments bound to it. The code that actually calls that
UserList component—namely, the call to
box.children — is located at some other unspecified place in the application.
The relevance of this concept of continuation to real React is up for interpretation. It certainly expresses what happens when a fiber is interrupted: a component has some props applied and is then returned and pushed onto a queue to be called elsewhere. But the idea seems broader than that. For example, the pattern also resembles what we do with higher order components. Higher order components take a component, perform some operation on it, such as pass it some props, and return a component for rendering at a later stage. Whether returned to a framework queue or returned within an application—a component is a component... is a continuation.
A more recent instance of the term has rather different connotations.
For Andrew Clark here, “a ‘continuation’ is a callback that is scheduled when yielding execution”. A simplified example follows which makes it clear what this means:
We have a function,
performWork, which processes some queue of tasks in a
while loop. When a deadline has expired, if there are still tasks on the queue, we return
performWork and schedule it for resumption at some later time. So
performWork, in this context, represents the continuation of a queue of tasks: it is a scheduling function rather than some specific component.
So a continuation might be a particular child fiber, whether handled by the framework or the user, or a callback to begin processing that child. It seems that when a component suspends in React Suspense, a continuation is created by applying
Continuation delimiters revisited
In a certain sense React has its continuations delimited by default, since it doesn’t actually have continuations, but only function calls. More meaningfully, we can think of the continuations saved during a particular update as delimited by the fiber on which the update was triggered. That is, when the last child has been processed, the list of side effects will be collected by walking back up the tree to the original fiber, at which point reconciliation “returns” and and the side-effects are passed to the renderer for committing.
A complication of this picture is that users can define extra delimiters in the form of Error Boundaries (any component with
componentDidCatch defined) or Suspense Boundaries ( a
React.Timeout component). These boundaries provide alternative — user-defined — return points for a continuation, which do not return straight to the renderer, but allow further operations. If an error is triggered or a timeout is exceeded, the current continuation is cleared up to the next boundary. That used to mean that any already-processed children of the boundary component were deleted, which would have been a closer approximation to shift-reset behaviour. However, recently, the code was changed in order to allow the children's state to be preserved. Now, the children are not actually unmounted, but instead retained in the tree and visually hidden until they are needed. This adjustment is a good example of how React modifies the language features it finds elsewhere to suit its own purposes.
Upon return to a suspense or error boundary, reconciliation is resumed from the boundary down an alternate path — with a fallback or a spinner — and only completes when we return finally to the originating fiber. This clearing of the continuation and resumption at the boundary occurs in a single render cycle.
With error and suspense boundaries, the code that actually manages the resumption is not located at the point of throwing or fetching, but outside the boundary/delimiter. That brings us to…
6. Effect handlers revisited
A year ago Sebastian Markbåge tweeted a gist labelled “Poor man’s algebraic effects” which provides a model of the technique later used by React Suspense.
A component is able to suspend the fiber it is running in by throwing a promise, which is caught and handled by the framework (note, in the gist, the promise is handled in the throwing function instead of the handler, which is not quite how things are done). When the promise resolves, its value is added to a cache, the fiber is restarted, and the next time the data is requested it is accessed synchronously from cache. This throw-handle-resume pattern resembles the way algebraic effects work, but differs in a few regards:
- First, instead of calling a continuation with some data, the handler saves that data in some shared mutable state (the cache), which the continuation later accesses. This refinement enables the data to be reused by other components or when the component is updated.
- Second, if something else changes while a component is suspended, React will not simply resume at the point of suspense, but will reconcile the tree again, using memoization to skip unnecessary work, and ‘resume’ as close to the suspended component as possible.
- Finally, the idea of algebraic effects as a language feature is that they give users the ability to create their own effects and handlers. We do not have this functionality with React, since we can only raise two kinds of hardcoded effects: data requests and errors. Even as handlers for these effects were being written, however, the idea of a generalised effect system was on the cards. Some thought was given to the possibility that a user might raise different effects by throwing objects with different type signatures:
This code was removed to prevent breaking changes, but we may see generic
Update 29/10/2018: While I was writing this post the React team announced a new feature, Hooks, about which Sebastian has written “conceptually, they are algebraic effects”. At a cursory glance it looks like custom Hooks do in fact give us something like custom algebraic effects. I will try to examine Hooks in a follow-up post: stay tuned.
7. Coroutines revisited
As mentioned above, fibers and coroutines name two distinct entities in the React codebase. Or rather, they did historically: “coroutines” no longer exist. While every component is represented by a
fiber object, coroutines appeared when work on Fiber was first getting going as a specific component type. That was in 2016. The relevant code has changed a lot since then and was recently temporarily(?) removed due to instability. Since coroutines are such a touted concept and may return at some point, it is interesting to look at it nonetheless.
Fibers, recall, are interruptible subroutines whose resumption is controlled by a scheduler. The idea behind coroutines — as opposed to fibers — was to give components explicit control over yielding and resumption. There are certain use-cases for which a parent needs an intermediate result from a child before finalising the props for that child. The main example of such a use-case is layout, where we might need to set position from the parent based on the width of the children. Instead of rendering once, collecting the value, and then rendering again, coroutines would allow us to generate the correct values in a single render-cycle. The following code is based on an example tweeted by Dan Abramov:
I have used an earlier API in the code above to highlight the conceptual debt to coroutines. In more recent iterations,
createYield were renamed
createReturn. Check out the original tweet for the modified syntax.
So how does it work? The second argument to
createCoroutine receives the result from
createYield and uses it to render the continuation component. The flow is a bit difficult to follow, but hopefully the following animation will help:
The reasons for the abandonment of the coroutine terminology are discussed in this pull request. In the original design, a continuation component returned in a
createYield would inherit the state of the yielding component. This proved to be problematic for reasons that escape me (please comment if you can shed light!). What is clear though, is that if the continuation component has no access to the state of the original call frame, it cannot conceptually be the continuation of that call frame. Rather, it is an ordinary callback.
From my haphazard attempts to decipher the source code, it looks like in both implementations coroutines were represented by multiple components/fibers, one for each stage: parent call, child return, and (pseudo-)continuation render. The implementation mirrored the technique from
regenerator we saw earlier: split the function at yield points, loop, and manage the transition using some state. In React, that state was the fiber’s
tag attribute, which stores the ‘type of work’ represented by fiber, either:
CoroutineComponent (parent call),
YieldComponent (child return) or
CoroutineHandlerPhase (continuation render).
It will be fascinating to see what form coroutines take when they return to React Fiber, if they do.
You made it this far?
… then I expect you are wanting some kind of conclusion 😛.
In the end, React does not exactly implement any of the language features this post is about, but blends the ideas behind them into something original which more precisely suits its purposes. This is why Sebastian Markbåge has said:
React is operating at the level of a language feature
In particular, React strives at every turn to insert ways of storing and reusing work that go beyond the functionality normally associated with continuations, coroutines, fibers and effects. It is also worth highlighting again that you can use the React API perfectly well without understanding what is going on under the hood. To fully understand how React achieves the functionality that it exposes, however, we should understand the ideas that inspired it.
Thanks for reading. I hope I have shed some light on these difficult topics, but this is just the beginning…many things are still unclear to me and React is evolving fast. Get in touch on Twitter 🐦. If you liked it, clap 👏 , share, and, hey, why not check out my other posts as well?
For more on React Fiber and the language features discussed in this blog post I recommend the following:
- Didact Fiber by Rodrigo Pombo: how to implement React Fiber (well, a miniature version) yourself.
- A React Fiber debugger by Dan Abramov: allows you to visualise which functions are called as reconciliation proceeds.
- Two great repositories by Shawn Wang and Toru Kobayashi: both chock full of resources on React Fiber.