Solving the structured control flow problem once and for all

Yuri Iozzelli
Apr 11, 2019 · 12 min read

Introduction

I work on Cheerp, a C++ to JavaScript and WebAssembly compiler based on LLVM.

JavaScript vs WebAssembly control flow

From the point of view of this article, JavaScript’s and WebAssembly’s control flow constructs are equivalent.

Image for post
Image for post
Equivalence of JavaScript and WebAssembly constructs
  • break and continue are replaced in WebAssembly by a single instruction br(for branch). Depending on the enclosing scope it is referring to, br has either the semantic of break or the one of continue . While in JavaScript there is an optional label argument, in WebAssembly there is a mandatory depth argument, which indicates which enclosing scope the br instruction is referring to.
  • loop instructions in WebAssembly are equivalent to an infinite while(1) loop in JavaScript, but with an implicit break at the end. Branch instructions targeting a loop jump to its beginning (acting like a continue).
  • block instructions in WebAssembly are equivalent to a JavaScript do..while(0) loop. Branch instructions targeting a block jump to its end (acting like a break ).

A universal but inefficient solution

According to the structured program theorem, we only need a loop and a branching construct in order to solve the problem (so we don’t even need break or continue statements!), but at the cost of using an additional helper variable (we will call it label) that encodes the control flow state.


Image for post
Image for post
Example CFG
let label='a';
while(1) {
switch(label) {
case 'a':
A();
if (A_to_B()) {
label='b';
} else {
label='c';
}
break;
case 'b':
B();
label='d';
break;
case 'c':
C();
label='e';
break;
case 'd':
D();
if (D_to_B()) {
label='b';
} else {
label='e';
}
break;
case 'e':
E();
break;
}
}

A();
if (A_to_B()) {
while(1) {
B();
D();
if (D_to_E()) {
break;
}
}
} else {
C();
}
E();

Relooper, and why it is not good enough

In the specific context of LLVM-to-JavaScript/WebAssembly compilers, the first algorithm that solved the structured control flow problem was Emscripten’ s Relooper. Cheerp itself used Relooper before version 2.0, but we replaced it for reasons I am going to explain shortly.


Image for post
Image for post
Confusing CFG for Relooper
switch (A())
{
case 0:
B();
case 1:
C();
case 2:
D();
default:
E();
}

var label=0;
switch(A()){
case 0:
{
B();
label=3;
break;
}
case 1:
{
label=3;
break;
}
case 2:
{
label=4;
break;
}
}
if(label===3){
C();
label=4;
}
if(label===4){
D();
}
E();
d:do{
e:do{
switch(A()){
case 0:
B();
break;
case 1:
break;
case 2:
break d;
default:
break e;
}
C();
}while(0);
D();
}while(0);
E();

Looking for prior art

As it often turns out in computer science, something very similar to our problem was thoroughly investigated and solved back in the ’70s: at the time people were arguing about the use and misuse of goto statements in programs, and proposing alternative constructs less powerful but sufficient to convert programs using gotos into a structured control flow form.

  • For all back edges (A, B), node B dominates node A.

The Stackifier algorithm

Putting aside irreducible CFG for a moment, how can we compile any reducible CFG into JavaScript without using the label variable?

  • We choose any topological ordering (in general, there are multiple valid ones), with the constraint that once a loop starts all the subsequent blocks must be dominated by the loop header, until all the loop blocks have appeared.
  • We enclose each loop in a loop scope (a while(1){/*...*/} in JavaScript and a loop...end in WebAssembly). All the back edges become continue statements (possibly labeled in case of nested scopes).
  • Now we are left with the forward edges. If the source and destination blocks are consecutive in the topological order, we don’t need to do anything.
  • Otherwise, we need to place a block scope (a do{/*...*/}while(0) in JavaScript, and a block...end in WebAssembly), such that the destination block is just after the end of the scope. In principle we can put all the scope openings at the beginning of the function, but in practice JavaScript engines don’t like deeply nested scopes and throw an error beyond a certain limit (around 1000 for a script with just empty nested scopes, less in real code). We place the opening just outside of the outermost scope that closes between the break and the end of the scope.
// NOT ALLOWED
while(1){
do{
}
while(0);
// OK
while(1){
do{
}while(0);
}

Image for post
Image for post
Example CFG stackified
block2:do{
do{
A();
if (A_to_C()) {
break;
}
while(1) {
B();
D();
if (D_to_B()) {
continue;
} else {
break block2;
}
}
}while(0);
C();
}while(0);
E();


Image for post
Image for post
Confusing CFG for Relooper stackified
d:do{
e:do{
c:do{
switch(A()){
case 0:
break;
case 1:
break c;
case 2:
break d;
default:
break e;
}
B();
}while(0);
C();
}while(0);
D();
}while(0);
E();

Improving Stackifier: adding if/else nesting

One nice feature of Relooper is that the control flow that it produces is (most of the time) quite readable and compact.

  • Then, in the rendering phase, if a block has only one forward predecessor, we nest it directly inside the corresponding branching statement of the predecessor. All the blocks dominated by this block will also end up inside of the branching statement.

Image for post
Image for post
Example CFG stackified, new version
A();
if (A_to_B()) {
while(1) {
B();
D();
if (D_to_B()) {
continue;
} else {
break;
}
}
} else {
C();
}
E();

Fix the irreducible control flow

In order to handle an irreducible CFG, we first need to somehow turn it into a reducible one.

  • SCCs with more than one element are loops
  • We identify all the loop headers, which are the nodes reachable from outside of the SCC
  • If there is more than one, we have a multiple-entry loop

Image for post
Image for post
Irreducible CFG on the left, fixed on the right

Performance Improvements

By limiting the use of the label variable, Stackifier produces better code than Relooper, but the actual performance gains for a whole program can vary widely, depending on if and where Relooper generated sub-optimal control flow.


Image for post
Image for post
arch linux emscriptenbenchmarks

Acknowledgments

The Stackifier algorithm has been first implemented in the WebAssembly backend of upstream LLVM. That implementation has been an important source of inspiration for Cheerp’s version (and its lack of documentation a source of motivation for writing this article). The if/else nesting strategy though is unique to Cheerp, as far as I know.

leaningtech

Leaning Technologies' Blog - everything Cheerp, CheerpJ…

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch

Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore

Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store