Understanding how browsers work by creating one

Published in

jobpal has been acquired by SmartRecruiters!

35 min read · Aug 28, 2020

In this article, I will demonstrate how browsers work by creating a simple mini-browser. This toy example can show the basic rendering principles. After all, learning by doing is always effective, so let’s start the journey by building from scratch a browser that supports HTTP requests, HTML/CSS parsing, and flexbox layout. If you want to view the complete code, please refer to this repository.

A browser’s basic work process

To display a page, the browser takes the following steps:

Requesting the page from the server using HTTP or HTTPS protocol
Parsing the HTML to build a DOM tree
Parsing the CSS to build a CSSOM tree
Combining the DOM and CSSOM into a render tree
Rendering the elements according to the CSS properties to get the bitmaps in memory
Compositing the bitmaps for better performance (optional)
Painting to display the content on the screen

The rendering engine does not wait until all of the HTML is parsed before it starts to build and lay out the tree. Each part of the content is displayed as soon as it is ready, so the page appears on the screen progressively.

In our mini-browser project, we follow the steps below to implement it.

.
├── client
│   ├── index.js  (step one)
│   ├── parser.js (step two and three)
│   ├── layout.js (step four)
│   ├── render.js (step five)
│   ├── package-lock.json
│   ├── package.json
│   └── viewport.jpg (the final result)
└── server
    ├── package-lock.json
    ├── package.json
    └── server.js

Step one: network request

Let’s start with this basic directory structure:

.
├── client
│ └── index.js
└── server
 └── server.js

Set up a server

First of all, let’s start a simple HTTP server which can send back a simple sentence “ Hello World\n”.

./server/server.js

const http = require("http");

http.createServer((request, response) => {
  let body = [];

  request.on('error', error => {
    console.log(error);
  })
    .on('data', chunk => {
      body.push(chunk.toString());
      console.log(body)
    })
    .on('end', () => {
      body = body.join('');
      response.writeHead(200, { 'Content-Type': 'text/html' });
      response.end(' Hello World\n');
    });
}).listen(8080);

console.log('The server is running!');

The http module is built into Node.js, so there is nothing to install. Run the server.js file from the server directory to start the server.

node server.js

I would recommend using nodemon, so you don’t have to restart the server by hand every time you change something.

nodemon server.js

Construct and send an HTTP request

To send an HTTP request, we need to

specify the host for IP
specify the port for TCP
specify the method, path, headers and body for HTTP

We put them in an IIFE (Immediately Invoked Function Expression) like this:

./client/index.js

void async function () {
  const request = new Request({
    host: '127.0.0.1',
    port: 8080,
    method: 'POST',
    path: '/',
    headers: {
      ['X-Jobpal']: 'chatbots',
    },
    body: {
      name: 'Jobpal',
    }
  });

  const response = await request.send();
  console.log(response);
}()

According to these requirements, we can implement the request class as follows.

./client/index.js

const net = require('net');

class Request {
  constructor({
    method,
    host,
    port,
    path,
    headers,
    body
  }) {
    this.host = host;
    this.port = port || 80;
    this.method = method || 'GET';
    this.path = path || '/';
    this.headers = headers || {};
    this.body = body || {};

    if (!this.headers['Content-Type']) {
      this.headers['Content-Type'] = 'application/x-www-form-urlencoded';
    }
    // encode the http body.
    if (this.headers['Content-Type'] === 'application/json') {
      this.bodyText = JSON.stringify(this.body);
    } else if (this.headers['Content-Type'] === 'application/x-www-form-urlencoded') {
      this.bodyText = Object.keys(this.body).map(
        key => `${key}=${encodeURIComponent(this.body[key])}`
      ).join('&');
    }

    // Content-Length counts bytes, not characters
    this.headers["Content-Length"] = Buffer.byteLength(this.bodyText);
  }
  // The parameter is a TCP connection.
  send(connection) {
    return new Promise((resolve, reject) => {
      if (connection) {
        connection.write(this.toString());
      } else {
        // create a TCP connection
        connection = net.createConnection(
          { host: this.host, port: this.port },
          () => connection.write(this.toString())
        );
      }

      connection.on('data', data => {
        console.log(data.toString());
        console.log(JSON.stringify(data.toString()));
      });

      connection.on('error', error => {
        console.log(error);
        reject(error);
        connection.end();
      })
    })
  }

  toString() {
    // the HTTP request
    return `${this.method} ${this.path} HTTP/1.1\r
${Object.keys(this.headers).map(key => `${key}: ${this.headers[key]}`).join('\r\n')}\r
\r
${this.bodyText}`;
  }
}

In the constructor, we save the parameters and set the default values. When dealing with headers, pay attention to Content-Type. It is required; otherwise, the body cannot be parsed correctly, because different formats require different parsing methods.

These are the four most commonly used formats:

application/json
application/x-www-form-urlencoded (when submitting form)
multipart/form-data (when uploading files)
text/xml

In the project, we set the default value as application/x-www-form-urlencoded. In this format, the submitted data will be encoded like key1=value1&key2=value2.
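To make the two encodings concrete, here is a minimal sketch. The encodeBody helper is a hypothetical name for illustration and is not part of the Request class; it also encodes the keys, which the class itself does not do.

```javascript
// Hypothetical helper: encode a plain object for the given Content-Type.
function encodeBody(body, contentType) {
  if (contentType === 'application/json') {
    return JSON.stringify(body);
  }
  if (contentType === 'application/x-www-form-urlencoded') {
    return Object.keys(body)
      .map(key => `${encodeURIComponent(key)}=${encodeURIComponent(body[key])}`)
      .join('&');
  }
  throw new Error(`Unsupported Content-Type: ${contentType}`);
}

const form = encodeBody(
  { name: 'Jobpal', role: 'chat bot' },
  'application/x-www-form-urlencoded'
);
console.log(form); // name=Jobpal&role=chat%20bot

// Content-Length is measured in bytes, which differs from the
// string length as soon as the body contains non-ASCII characters.
console.log('\u00e9'.length, Buffer.byteLength('\u00e9')); // 1 2
```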

In the toString() method, the HTTP request is built. It is composed of a request line, request headers, and a request body.
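As a rough illustration of that layout, the sketch below assembles the same three parts with an array join. The buildRequestText function is a hypothetical stand-in, not the article's toString() method.

```javascript
// Hypothetical helper: request line, header lines, blank line, body.
function buildRequestText({ method, path, headers, bodyText }) {
  const requestLine = `${method} ${path} HTTP/1.1`;
  const headerLines = Object.keys(headers).map(key => `${key}: ${headers[key]}`);
  // The empty string produces the blank line that separates headers from body.
  return [requestLine, ...headerLines, '', bodyText].join('\r\n');
}

const text = buildRequestText({
  method: 'POST',
  path: '/',
  headers: {
    'X-Jobpal': 'chatbots',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': 11,
  },
  bodyText: 'name=Jobpal',
});

// JSON.stringify makes the \r\n separators visible.
console.log(JSON.stringify(text));
```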

Then, in the send() method, we send this request and print the response data when it arrives.

The net module, which we need to create a TCP connection, is built into Node.js, so it does not have to be installed.

As we can see, the structure of response data conforms to the above figure.

Maybe you have noticed that the response body is different from what we sent on the server-side.

what we sent: ‘ Hello World\n’
what we get: ‘d\r\n Hello World\n\r\n0\r\n\r\n’

The reason is the Transfer-Encoding: chunked in the response headers.

As described here:

chunked
Data is sent in a series of chunks. The Content-Length header is omitted in this case and at the beginning of each chunk you need to add the length of the current chunk in hexadecimal format, followed by \r\n and then the chunk itself, followed by another \r\n. The terminating chunk is a regular chunk, with the exception that its length is zero. It is followed by the trailer, which consists of a (possibly empty) sequence of entity header fields.

So in the first chunk, d is a hexadecimal number, indicating that there are 13 characters, which are ‘HelloWorld’, two spaces, and a \n. The second chunk, the empty one, means the body part is over.
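The chunk format can be checked with a few lines of code. This is only a sketch that assumes the whole body is already in memory and contains only ASCII characters (chunk sizes are byte counts, while string indexes count characters). The decodeChunked name is made up for this example.

```javascript
// Hypothetical helper: decode a complete chunked body held in one string.
function decodeChunked(raw) {
  let body = '';
  let position = 0;
  while (true) {
    const lineEnd = raw.indexOf('\r\n', position);
    if (lineEnd === -1) {
      throw new Error('Incomplete chunk header');
    }
    const size = parseInt(raw.slice(position, lineEnd), 16);
    if (Number.isNaN(size)) {
      throw new Error('Invalid chunk size');
    }
    if (size === 0) {
      return body; // the zero-length chunk terminates the body
    }
    const start = lineEnd + 2;
    body += raw.slice(start, start + size);
    position = start + size + 2; // skip the \r\n that follows the chunk data
  }
}

const decoded = decodeChunked('d\r\n Hello World\n\r\n0\r\n\r\n');
console.log(JSON.stringify(decoded)); // " Hello World\n"
```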

We have successfully got the response data. Next, let’s parse it.

Parse HTTP response

Parse response line and response head

As you can see, the response is such a string:

the response line and the response headers are separated by \r\n
there are two \r\n between the last response header and the response body

HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nDate: Sun, 02 Aug 2020 17:05:31 GMT\r\nConnection: keep-alive\r\nTransfer-Encoding: chunked\r\n\r\nd\r\n Hello World\n\r\n0\r\n\r\n
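Before building a state machine, it can help to see the same structure taken apart with plain string operations. The sketch below assumes the complete response is available as one string, which is not guaranteed on a real TCP connection where data arrives in pieces; splitResponse is a hypothetical helper.

```javascript
// Hypothetical helper: split a complete raw response into its parts.
function splitResponse(raw) {
  const headEnd = raw.indexOf('\r\n\r\n'); // blank line between head and body
  const [statusLine, ...headerLines] = raw.slice(0, headEnd).split('\r\n');
  const headers = {};
  for (const line of headerLines) {
    const separator = line.indexOf(':'); // only the first colon separates name and value
    headers[line.slice(0, separator)] = line.slice(separator + 1).trim();
  }
  return { statusLine, headers, rawBody: raw.slice(headEnd + 4) };
}

const raw =
  'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n' +
  'Date: Sun, 02 Aug 2020 17:05:31 GMT\r\nConnection: keep-alive\r\n' +
  'Transfer-Encoding: chunked\r\n\r\nd\r\n Hello World\n\r\n0\r\n\r\n';

const parts = splitResponse(raw);
console.log(parts.statusLine);
console.log(parts.headers);
console.log(JSON.stringify(parts.rawBody));
```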

We can use a state machine to parse this string. The states can be designed according to the format of the response, like the code below.

In the index.js, we add the ResponseParser class.

./client/index.js

class ResponseParser {
  constructor() {
    this.WAITING_STATUS_LINE = 0;
    this.WAITING_STATUS_LINE_END = 1;

    this.WAITING_HEADER_NAME = 2;
    this.WAITING_HEADER_SPACE = 3;
    this.WAITING_HEADER_VALUE = 4;
    this.WAITING_HEADER_LINE_END = 5;
    this.WAITING_HEADER_BLOCK_END = 6;

    this.WAITING_BODY = 7;

    // initial state
    this.current = this.WAITING_STATUS_LINE;

    this.statusLine = '';
    this.headers = {};
    this.headerName = '';
    this.headerValue = '';
    this.bodyParser = null;
  }

  get isFinished() { }

  get response() { }

  receive(string) {
    for (const char of string) {
      this.receiveChar(char);
    }
    console.log(this.statusLine, '\n', this.headers)
  }
  
  // a state machine
  receiveChar(char) {
    switch (this.current) {
      case this.WAITING_STATUS_LINE:
        if (char === '\r') {
          this.current = this.WAITING_STATUS_LINE_END;
        } else {
          this.statusLine += char;
        }
        break;

      case this.WAITING_STATUS_LINE_END:
        if (char === '\n') {
          this.current = this.WAITING_HEADER_NAME;
        }
        break;

      case this.WAITING_HEADER_NAME:
        if (char === ':') {
          this.current = this.WAITING_HEADER_SPACE;
        } else if (char === '\r') {
          // a blank line: there are no headers at all, or no more headers
          this.current = this.WAITING_HEADER_BLOCK_END;
        } else {
          this.headerName += char;
        }
        break;

      case this.WAITING_HEADER_SPACE:
        if (char === ' ') {
          this.current = this.WAITING_HEADER_VALUE;
        }
        break;
      case this.WAITING_HEADER_VALUE:
        if (char === '\r') {
          this.current = this.WAITING_HEADER_LINE_END;
        } else {
          this.headerValue += char;
        }
        break;

      case this.WAITING_HEADER_LINE_END:
        if (char === '\n') {
          this.current = this.WAITING_HEADER_NAME;
          this.headers = {
            ...this.headers,
            [this.headerName]: this.headerValue
          }
          this.headerName = '';
          this.headerValue = '';
        }
        break;

      case this.WAITING_HEADER_BLOCK_END:
        if (char === '\n') {
          this.current = this.WAITING_BODY;
        }
        break;

      case this.WAITING_BODY:
        console.log(JSON.stringify(char));
        break;

      default:
        break;
    }
  }
}

Then, in the send() method of the Request class, we pass the data to the parser.

send(connection) {
    return new Promise((resolve, reject) => {
      const parser = new ResponseParser();

      if (connection) {
        connection.write(this.toString());
      } else {
        // create a TCP connection
        connection = net.createConnection(
          { host: this.host, port: this.port },
          () => connection.write(this.toString())
        );
      }

      connection.on('data', data => {
        console.log(data.toString());
        console.log(JSON.stringify(data.toString()));

        // receive the data and pass it to the parser
        parser.receive(data.toString());
        if (parser.isFinished) {
          resolve(parser.response);
          connection.end();
        }
      });

      connection.on('error', error => {
        console.log(error);
        reject(error);
        connection.end();
      })
    })
  }

Executing the file, we can see that the response line and response headers have been parsed successfully, and the content of the response body has been printed.

Different types of response bodies require different parsing methods. Next, we only take chunked as an example.
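For comparison, a body announced with a Content-Length header could be handled by a parser with the same interface (receiveChar, isFinished, content). The ContentLengthBodyParser class below is hypothetical and not part of the project; it is only a sketch of what "a different parsing method" might look like.

```javascript
// Hypothetical parser for bodies whose size is given by Content-Length.
class ContentLengthBodyParser {
  constructor(length) {
    this.remaining = length; // bytes still expected
    this.content = [];
    this.isFinished = length === 0;
  }

  receiveChar(char) {
    if (this.isFinished) {
      return; // ignore anything after the announced length
    }
    this.content.push(char);
    // Content-Length counts bytes, so subtract the encoded size of the char
    this.remaining -= Buffer.byteLength(char);
    if (this.remaining <= 0) {
      this.isFinished = true;
    }
  }
}

const lengthParser = new ContentLengthBodyParser(Number('5'));
for (const char of 'Hello, extra') {
  lengthParser.receiveChar(char);
}
console.log(lengthParser.isFinished, lengthParser.content.join('')); // true Hello
```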

Parse response body

In the newly added ChunkedBodyParser class, we also use a state machine to parse the response body.

The hexadecimal number and the content are separated by \r\n, so we design the states like this.

./client/index.js

class ChunkedBodyParser {
  constructor() {
    this.WAITING_LENGTH = 0;
    this.WAITING_LENGTH_LINE_END = 1;

    this.READING_CHUNK = 2;
    this.WAITING_BLANK_LINE = 3;
    this.WAITING_BLANK_LINE_END = 4;

    this.current = this.WAITING_LENGTH;

    this.content = [];
    this.length = 0;
    this.isFinished = false;
  }

  receiveChar(char) {
    switch (this.current) {
      case this.WAITING_LENGTH:
        if (char === '\r') {
          this.current = this.WAITING_LENGTH_LINE_END;
          if (this.length === 0) {
            this.isFinished = true;
          }
        } else {
          // operation of hexadecimal numbers, for example, 2AF5, 2 => A => F => 5
          this.length *= 16;
          this.length += parseInt(char, 16);
        }
        break;

      case this.WAITING_LENGTH_LINE_END:
        if (char === '\n' && !this.isFinished) {
          this.current = this.READING_CHUNK;
        }
        break;

      case this.READING_CHUNK:
        this.content.push(char);
        this.length -= 1;
        if (this.length === 0) {
          this.current = this.WAITING_BLANK_LINE;
        }
        break;

      case this.WAITING_BLANK_LINE:
        if (char === '\r') {
          this.current = this.WAITING_BLANK_LINE_END;
        }
        break;

      case this.WAITING_BLANK_LINE_END:
        if (char === '\n') {
          this.current = this.WAITING_LENGTH;
        }
        break;

      default:
        break;
    }
  }
}
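The digit-by-digit arithmetic in the WAITING_LENGTH state can be tried in isolation. The parseHexLength function below is a hypothetical helper that repeats the same multiply-and-add step outside the parser.

```javascript
// Hypothetical helper: accumulate a hexadecimal number one digit at a time.
function parseHexLength(digits) {
  let length = 0;
  for (const char of digits) {
    const value = parseInt(char, 16);
    if (Number.isNaN(value)) {
      throw new Error(`Not a hexadecimal digit: ${char}`);
    }
    // shift the previous digits one hex place to the left, then add the new digit
    length = length * 16 + value;
  }
  return length;
}

console.log(parseHexLength('d'));    // 13
console.log(parseHexLength('2AF5')); // 10997
```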

Then, we use the parser in the receiveChar() method of the ResponseParser class.

receiveChar(char) {
  ...
  case this.WAITING_HEADER_NAME:
        if (char === ':') {
          this.current = this.WAITING_HEADER_SPACE;
        } else if (char === '\r') {
          this.current = this.WAITING_HEADER_BLOCK_END;
          if (this.headers['Transfer-Encoding'] === 'chunked') {
            this.bodyParser = new ChunkedBodyParser();
          }
        } else {
          this.headerName += char;
        }
        break;
  ...
  case this.WAITING_BODY:
        this.bodyParser.receiveChar(char);
        break;
  ...
}

At this point, the response line, response headers, and response body are all parsed, so we can complete the isFinished and response getters in the ResponseParser class.

get isFinished() {
  return this.bodyParser && this.bodyParser.isFinished;
}
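get response() {
  // read the capture groups from the match result instead of the legacy RegExp.$1 / RegExp.$2
  const [, statusCode, statusText] = this.statusLine.match(/HTTP\/1\.1 ([0-9]+) ([\s\S]+)/);
  return {
    statusCode,
    statusText,
    headers: this.headers,
    body: this.bodyParser.content.join('')
  }
}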

Running the client now prints the parsed response object, with its statusCode, statusText, headers, and body.
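The status line handling can also be tested on its own. The sketch below uses a hypothetical parseStatusLine function that reads the capture groups from the match result and fails loudly when the line has an unexpected shape.

```javascript
// Hypothetical helper: extract the status code and text from a status line.
function parseStatusLine(statusLine) {
  const match = statusLine.match(/^HTTP\/1\.1 ([0-9]+) ([\s\S]*)$/);
  if (match === null) {
    throw new Error(`Unexpected status line: ${statusLine}`);
  }
  // Both values are strings, because they come straight from the regex groups.
  return { statusCode: match[1], statusText: match[2] };
}

const status = parseStatusLine('HTTP/1.1 200 OK');
console.log(status); // { statusCode: '200', statusText: 'OK' }
```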

Let’s send HTML from the server. After saving the changes, nodemon should automatically restart the server. Requesting again from the client-side, the response is printed out.

./server/server.js

const http = require("http");

http.createServer((request, response) => {
  let body = [];

  request.on('error', error => {
    console.log(error);
  })
    .on('data', chunk => {
      body.push(chunk.toString());
      console.log(body)
    })
    .on('end', () => {
      body = body.join('');
      response.writeHead(200, { 'Content-Type': 'text/html' });
      response.end(
        `<html maaa=a >
    <head>
          <style>
    body div #myid{
      width:100px;
      background-color: #ff5000;
    }
    body div img{
      width:30px;
      background-color: #ff1111;
    }
      </style>
    </head>
    <body>
      <div>
          <img id="myid"/>
          <img />
      </div>
    </body>
    </html>`);
    });
}).listen(8080);

console.log('The server is running!');

So far, we have successfully received the HTML from the network request.

Now, we can try to parse HTML to get a DOM tree.

Step two: HTML parsing

Tokenization (lexical analysis)

A token is the smallest meaningful unit in compiler theory. In HTML, most of the tokens we need for daily development are start tags, attributes, end tags, comments, and CDATA nodes.

Take <p class="a">text</p> as an example. Splitting it into the smallest meaningful units, we get:

<p the start of an opening tag
class="a" attribute
> the end of an opening tag
text the text content
</p> the closing tag
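The units listed above can be reproduced with a rough regex-based sketch. This is not the state machine from the HTML standard that we build next: the roughTokenize function is hypothetical, only understands double-quoted attributes, breaks on a > inside an attribute value, and emits each text run as one token.

```javascript
// Hypothetical helper: a rough tokenizer for very simple HTML fragments.
function roughTokenize(html) {
  const tokens = [];
  // Either a tag (optional slash, name, rest of the tag) or a run of text.
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>|([^<]+)/g;
  for (const match of html.matchAll(pattern)) {
    const [, slash, tagName, rest, text] = match;
    if (text !== undefined) {
      tokens.push({ type: 'text', content: text });
    } else if (slash === '/') {
      tokens.push({ type: 'endTag', tagName });
    } else {
      const token = { type: 'startTag', tagName };
      // Copy each name="value" pair onto the token.
      for (const [, name, value] of rest.matchAll(/([^\s=\/]+)="([^"]*)"/g)) {
        token[name] = value;
      }
      tokens.push(token);
    }
  }
  return tokens;
}

const roughTokens = roughTokenize('<p class="a">text</p>');
console.log(roughTokens);
```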

As before, we use a state machine for parsing. Fortunately, the state machine is already specified in the HTML standard, so all we need to do is translate it to JavaScript.

There are more than 80 states specified, but we don’t need so many states to parse our simple HTML. So in our sample code, some simplifications are made.

Set up an empty state machine

We design the state as a function so that the state transition part is very simple.

let state = data; // the initial state is called 'data' in the HTML standard

for (const char of html) {
  state = state(char);
}
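To see the "state as a function" idea in a complete program, here is a tiny two-state machine that counts tags. The state names text and insideTag and the countTags function are invented for this sketch and are far simpler than the states of the HTML standard.

```javascript
let tagCount = 0;

// State: outside of any tag.
function text(char) {
  return char === '<' ? insideTag : text;
}

// State: between '<' and '>'.
function insideTag(char) {
  if (char === '>') {
    tagCount += 1;
    return text;
  }
  return insideTag;
}

function countTags(html) {
  tagCount = 0;
  let state = text;
  for (const char of html) {
    state = state(char); // each state returns the next state
  }
  return tagCount;
}

console.log(countTags('<p class="a">text</p>')); // 2
```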

Add a new file “parser.js”.

./client/parser.js

const EOF = Symbol('EOF'); // end of file token

module.exports.parseHTML = function (html) {
  let state = data;

  for (const char of html) {
    console.log(JSON.stringify(char), state.name)
    state = state(char);
  }
  state = state(EOF);
}

// There are three kinds of HTML tags: opening tag <div>, closing tag </div>, self-closing tag <div/>

// the initial state is called 'data' in the HTML standard
function data(char) {
  if (char === '<') {
    return tagOpen;
  } else if (char === EOF) {
    return;
  } else {
    // text node
    return data;
  }
}

// when the state is tagOpen, we have only seen '<' and don't know what kind of tag it is yet
function tagOpen(char) {
  if (char === '/') {
    // '</' ,e.g. '</div>'
    return endTagOpen;
  } else if (char.match(/^[a-zA-Z]$/)) {
    // the char is a letter, so the tag is an opening tag or a self-closing tag, e.g. '<d' => '<div>' or '<div/>'
    return tagName(char); // reconsume the char
  } else {
    // Parse error
    return;
  }
}

function endTagOpen(char) {
  if (char.match(/^[a-zA-Z]$/)) {
    return tagName(char);
  } else if (char === '>') {
    // error  />  It's html, not JSX
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    // Parse error
  }
}

function tagName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    // tagname start from a '<', end with a ' ' e.g. '<div prop'
    return beforeAttributeName;
  } else if (char === '/') {
    return selfClosingStartTag;
  } else if (char.match(/^[a-zA-Z]$/)) {
    return tagName;
  } else if (char === '>') {
    // the current tag is over, go back to the initial state to parse the next tag
    return data;
  } else {
    return tagName;
  }
}

// Only '>' is valid after '<div/'
function selfClosingStartTag(char) {
  if (char === ">") {
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    // Parse error
  }
}

function beforeAttributeName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/" || char === ">" || char === EOF) {
    return afterAttributeName(char);
  } else if (char === "=") {
    // Parse error
  } else {
    return attributeName(char);
  }
}

function attributeName(char) {
  if (char.match(/^[\t\n\f ]$/) || char === "/" || char === ">" || char === EOF) { // end of the attribute name, e.g. "<div hidden " or "<div hidden>"
    return afterAttributeName(char);
  } else if (char === "=") { // e.g. 'class='
    return beforeAttributeValue;
  } else if (char === "\u0000") { // null
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<") {
    // Parse error
  } else {
    return attributeName;
  }
}
function beforeAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeValue;
  } else if (char === "\"") {
    return doubleQuotedAttributeValue; // e.g. <html attribute="
  } else if (char === "\'") {
    return singleQuotedAttributeValue; // e.g. <html attribute='
  } else if (char === ">") {
    // Parse error: the value is missing, e.g. <html attribute=>
    return data;
  } else {
    return UnquotedAttributeValue(char); // e.g. <html attribute=
  }
}

function doubleQuotedAttributeValue(char) {
  if (char === "\"") { // the second double quotes
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return doubleQuotedAttributeValue;
  }
}

function singleQuotedAttributeValue(char) {
  if (char === "\'") {
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return singleQuotedAttributeValue;
  }
}

// when the attribute part is over, save attribute name and value to the current token
function UnquotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) { // end of unquoted Attribute value
    return beforeAttributeName; // to parse a new pair of attribute
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === ">") {
    return data;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<" || char === "=" || char === "`") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return UnquotedAttributeValue;
  }
}

function afterQuotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === ">") {
    return data;
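  } else if (char === EOF) {
    // Parse error
  } else {
    // missing whitespace between attributes: treat the char as the start of the next attribute
    return beforeAttributeName(char);
  }
}

function afterAttributeName(char) {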
  if (char.match(/^[\t\n\f ]$/)) {
    return afterAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === "=") {
    return beforeAttributeValue;
  } else if (char === ">") {
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    return attributeName(char);
  }
}

Then, use the new parser in the index.js.

./client/index.js

const parser = require('./parser');
...

void async function () {
  const request = new Request({
   ...
  });

  const response = await request.send();
  const dom = parser.parseHTML(response.body);
  console.log(JSON.stringify(dom, null, 2));
}()

Run index.js, and the state machine is running! In the printed log, you can see which state handles each character.

The dom is undefined because the state machine is still empty: it only moves between states and produces nothing. In the next step, let's add the logic to these states to build the DOM tree.

Emit tokens

The tag types are nothing more than the starting-tag, the closing tag, and the self-closing tag. In the state machine, we emit the tag token when we encounter the end of a tag.

The emit() function takes the token generated from the state machine. In this step, it does nothing but print the received token.

So the parser.js looks like this.

./client/parser.js

const EOF = Symbol('EOF');
let currentToken = null;

function emit(token) {
  console.log(token);
}

module.exports.parseHTML = function (html) {
  let state = data;

  for (const char of html) {
    state = state(char);
  }
  state = state(EOF);
}

function data(char) {
  if (char === '<') {
    return tagOpen;
  } else if (char === EOF) {
    emit({ type: 'EOF' });
    return;
  } else {
    // emit the text nodes one by one, we can join them later
    emit({
      type: 'text',
      content: char,
    });
    return data;
  }
}

function tagOpen(char) {
  if (char === '/') {
    return endTagOpen;
  } else if (char.match(/^[a-zA-Z]$/)) {
    currentToken = {
      type: 'startTag',
      tagName: '',
    }
    return tagName(char); // reconsume
  } else {
    // Parse error
    return;
  }
}

function endTagOpen(char) {
  if (char.match(/^[a-zA-Z]$/)) {
    currentToken = {
      type: 'endTag',
      tagName: ''
    }
    return tagName(char);
  } else if (char === '>') {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    // Parse error
  }
}

function tagName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === '/') {
    return selfClosingStartTag;
  } else if (char.match(/^[a-zA-Z]$/)) {
    currentToken.tagName += char;
    return tagName;
  } else if (char === '>') {
    // the current tag is over, go back to the initial state to parse the next tag
    emit(currentToken);
    return data;
  } else {
    return tagName;
  }
}

function selfClosingStartTag(char) {
  if (char === ">") {
    currentToken.isSelfClosing = true;
    emit(currentToken);
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    // Parse error
  }
}

function beforeAttributeName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/" || char === ">" || char === EOF) {
    return afterAttributeName(char);
  } else if (char === "=") {
    // parser error
  } else {
    return attributeName(char);
  }
}

function attributeName(char) {
  if (char.match(/^[\t\n\f ]$/) || char === "/" || char === ">" || char === EOF) {
    return afterAttributeName(char);
  } else if (char === "=") {
    return beforeAttributeValue;
  } else if (char === "\u0000") { // null
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<") {
    // Parse error
  } else {
    return attributeName;
  }
}

function beforeAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeValue;
  } else if (char === "\"") {
    return doubleQuotedAttributeValue;
  } else if (char === "\'") {
    return singleQuotedAttributeValue;
  } else if (char === ">") {
    // Parse error: the value is missing, the tag still ends here
    emit(currentToken);
    return data;
  } else {
    return UnquotedAttributeValue(char);
  }
}

function doubleQuotedAttributeValue(char) {
  if (char === "\"") {
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return doubleQuotedAttributeValue;
  }
}

function singleQuotedAttributeValue(char) {
  if (char === "\'") {
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return singleQuotedAttributeValue;
  }
}

function UnquotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === ">") {
    emit(currentToken);
    return data;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<" || char === "=" || char === "`") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    return UnquotedAttributeValue;
  }
}

function afterQuotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === ">") {
    emit(currentToken);
    return data;
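  } else if (char === EOF) {
    // Parse error
  } else {
    // missing whitespace between attributes: treat the char as the start of the next attribute
    return beforeAttributeName(char);
  }
}

function afterAttributeName(char) {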
  if (char.match(/^[\t\n\f ]$/)) {
    return afterAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === "=") {
    return beforeAttributeValue;
  } else if (char === ">") {
    emit(currentToken);
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    return attributeName(char);
  }
}

Now, we get these tokens. You might have noticed that the attributes are missing. For example, the token of the img tag with id "myid" is { type: 'startTag', tagName: 'img', isSelfClosing: true }. The id is not included.

The way of adding attributes is similar to the previous step. Add the currentAttribute variable and complete the logic in the state machine like below.

./client/parser.js

const EOF = Symbol('EOF');
let currentToken = null;
let currentAttribute = null;

...

// copy the finished attribute onto the current token
function saveAttribute() {
  if (currentAttribute !== null) {
    currentToken[currentAttribute.name] = currentAttribute.value;
    currentAttribute = null;
  }
}

function beforeAttributeName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/" || char === ">" || char === EOF) {
    return afterAttributeName(char);
  } else if (char === "=") {
    // Parse error
  } else {
    // a new attribute starts here
    currentAttribute = { name: '', value: '' };
    return attributeName(char);
  }
}

function attributeName(char) {
  if (char.match(/^[\t\n\f ]$/) || char === "/" || char === ">" || char === EOF) {
    return afterAttributeName(char);
  } else if (char === "=") {
    return beforeAttributeValue;
  } else if (char === "\u0000") { // null
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<") {
    // Parse error
  } else {
    currentAttribute.name += char;
    return attributeName;
  }
}

function beforeAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeValue;
  } else if (char === "\"") {
    return doubleQuotedAttributeValue;
  } else if (char === "\'") {
    return singleQuotedAttributeValue;
  } else if (char === ">") {
    // Parse error: the value is missing, the tag still ends here
    saveAttribute();
    emit(currentToken);
    return data;
  } else {
    return UnquotedAttributeValue(char);
  }
}

function doubleQuotedAttributeValue(char) {
  if (char === "\"") {
    saveAttribute();
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    currentAttribute.value += char;
    return doubleQuotedAttributeValue;
  }
}

function singleQuotedAttributeValue(char) {
  if (char === "\'") {
    saveAttribute();
    return afterQuotedAttributeValue;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    currentAttribute.value += char;
    return singleQuotedAttributeValue;
  }
}

function UnquotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    saveAttribute();
    return beforeAttributeName;
  } else if (char === "/") {
    saveAttribute();
    return selfClosingStartTag;
  } else if (char === ">") {
    saveAttribute();
    emit(currentToken);
    return data;
  } else if (char === "\u0000") {
    // Parse error
  } else if (char === "\"" || char === "'" || char === "<" || char === "=" || char === "`") {
    // Parse error
  } else if (char === EOF) {
    // Parse error
  } else {
    currentAttribute.value += char;
    return UnquotedAttributeValue;
  }
}

function afterQuotedAttributeValue(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return beforeAttributeName;
  } else if (char === "/") {
    return selfClosingStartTag;
  } else if (char === ">") {
    emit(currentToken);
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    // missing whitespace between attributes: treat the char as the start of the next attribute
    return beforeAttributeName(char);
  }
}

function afterAttributeName(char) {
  if (char.match(/^[\t\n\f ]$/)) {
    return afterAttributeName;
  } else if (char === "/") {
    saveAttribute(); // attribute without a value, e.g. <input disabled/>
    return selfClosingStartTag;
  } else if (char === "=") {
    return beforeAttributeValue;
  } else if (char === ">") {
    saveAttribute(); // attribute without a value, e.g. <input disabled>
    emit(currentToken);
    return data;
  } else if (char === EOF) {
    // Parse error
  } else {
    // the previous attribute had no value and a new one starts
    saveAttribute();
    currentAttribute = { name: '', value: '' };
    return attributeName(char);
  }
}

Run index.js again, and you can see that the attributes are back!

At this point, we have split the character stream into tokens, and then let’s use these tokens to build a DOM tree.

DOM tree construction (syntactic analysis)

In real browsers, HTML nodes are instances of different subclasses of Node. To simplify our code, we only divide Node into Element and Text.

For elements, we construct the DOM tree by using a stack to match the tags.

Specifically, when the emit function receives the token, it starts to build the DOM tree. When it encounters a startTag, it pushes the element into the stack, and when it encounters the matching endTag, it pops the top element out of the stack. Self-closing tags do not need to be pushed into the stack.

It is worth mentioning that when a tag mismatch occurs, the real browser will do fault-tolerant processing, while we just throw an error here.

When all tokens have been received, the only node left on the stack (the document node) is the root node of the DOM tree.
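To watch the stack grow and shrink, the sketch below feeds a hand-written token list through the same push and pop rules and records the stack after every token. The traceStack function and its token list are made up for illustration; attributes and text nodes are left out.

```javascript
// Hypothetical helper: build a tree from tokens and record the stack after each one.
function traceStack(tokens) {
  const stack = [{ type: 'document', children: [] }];
  const trace = [];
  for (const token of tokens) {
    const top = stack[stack.length - 1];
    if (token.type === 'startTag') {
      const element = { type: 'element', tagName: token.tagName, children: [] };
      top.children.push(element);
      if (!token.isSelfClosing) {
        stack.push(element); // self-closing tags never go on the stack
      }
    } else if (token.type === 'endTag') {
      if (top.tagName !== token.tagName) {
        throw new Error('Tag does not match');
      }
      stack.pop();
    }
    trace.push(stack.map(node => node.tagName || node.type).join(' > '));
  }
  return { root: stack[0], trace };
}

const { root, trace } = traceStack([
  { type: 'startTag', tagName: 'div' },
  { type: 'startTag', tagName: 'img', isSelfClosing: true },
  { type: 'endTag', tagName: 'div' },
]);
console.log(trace); // [ 'document > div', 'document > div', 'document' ]
```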

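For Text nodes, we need to merge them when they are adjacent. Text nodes are never pushed into the stack. Instead, we keep a reference to the current Text node: the first text token creates the node and adds it to the children of the top element, and each following text token is appended to it. Any tag token resets the reference, so the next text token starts a new Text node.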
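The merging rule can be tried separately from the tree construction. In this sketch, the collectChildren function is hypothetical and treats every non-text token as an empty element, just to show when a new Text node is started.

```javascript
// Hypothetical helper: turn a flat token list into children with merged text nodes.
function collectChildren(tokens) {
  const children = [];
  let currentTextNode = null;
  for (const token of tokens) {
    if (token.type === 'text') {
      if (currentTextNode === null) {
        currentTextNode = { type: 'text', content: '' };
        children.push(currentTextNode);
      }
      currentTextNode.content += token.content;
    } else {
      children.push({ type: 'element', tagName: token.tagName, children: [] });
      currentTextNode = null; // a tag ends the current run of text
    }
  }
  return children;
}

const merged = collectChildren([
  { type: 'text', content: 'h' },
  { type: 'text', content: 'i' },
  { type: 'startTag', tagName: 'br', isSelfClosing: true },
  { type: 'text', content: '!' },
]);
console.log(merged);
```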

./client/parser.js

const EOF = Symbol('EOF');
let currentToken = null;
let currentAttribute = null;
let currentTextNode = null;

// a stack containing the root node
const stack = [{ type: 'document', children: [] }];

module.exports.parseHTML = function (html) {
  let state = data;

  for (const char of html) {
    state = state(char);
  }
  state = state(EOF);
  // return the DOM tree
  return stack[0];
}

...

function emit(token) {
  let top = stack[stack.length - 1];

  if (token.type === 'startTag') {
    // create the element
    let element = {
      type: 'element',
      children: [],
      attributes: [],
      tagName: token.tagName
    }

    for (const prop in token) {
      if (prop !== "type" && prop !== "tagName") {
        element.attributes.push({
          name: prop,
          value: token[prop],
        })
      }
    }

    top.children.push(element);
    // this line is commented out to avoid the circular structure error when printing the DOM tree
    // element.parent = top;

    if (!token.isSelfClosing) {
      stack.push(element);
    }

    currentTextNode = null;
  } else if (token.type === 'endTag') {
    if (top.tagName !== token.tagName) {
      throw new Error('Tag does not match');
    } else {
      stack.pop();
    }
    currentTextNode = null;
  } else if (token.type === 'text') {
    if (currentTextNode === null) {
      currentTextNode = {
        type: "text",
        content: "",
      }
      top.children.push(currentTextNode);
    }
    currentTextNode.content += token.content;
  }
}

Run index.js again. Instead of undefined, we can now see a DOM tree! Under the document node is the html node, and below it are the head node and the body node.

{
  "type": "document",
  "children": [
    {
      "type": "element",
      "children": [
        {
          "type": "text",
          "content": "\n      "
        },
        {
          "type": "element",
          "children": [
            {
              "type": "text",
              "content": "\n            "
            },
            {
              "type": "element",
              "children": [
                {
                  "type": "text",
                  "content": "\n      body div #myid{\n        width:100px;\n        background-color: #ff5000;\n      }\n      body div img{\n        width:30px;\n        background-color: #ff1111;\n      }\n        "
                }
              ],
              "attributes": [],
              "tagName": "style"
            },
            {
              "type": "text",
              "content": "\n      "
            }
          ],
          "attributes": [],
          "tagName": "head"
        },
        {
          "type": "text",
          "content": "\n      "
        },
        {
          "type": "element",
          "children": [
            {
              "type": "text",
              "content": "\n        "
            },
            {
              "type": "element",
              "children": [
                {
                  "type": "text",
                  "content": "\n            "
                },
                {
                  "type": "element",
                  "children": [],
                  "attributes": [
                    {
                      "name": "id",
                      "value": "myid"
                    },
                    {
                      "name": "isSelfClosing",
                      "value": true
                    }
                  ],
                  "tagName": "img"
                },
                {
                  "type": "text",
                  "content": "\n            "
                },
                {
                  "type": "element",
                  "children": [],
                  "attributes": [
                    {
                      "name": "isSelfClosing",
                      "value": true
                    }
                  ],
                  "tagName": "img"
                },
                {
                  "type": "text",
                  "content": "\n        "
                }
              ],
              "attributes": [],
              "tagName": "div"
            },
            {
              "type": "text",
              "content": "\n      "
            }
          ],
          "attributes": [],
          "tagName": "body"
        },
        {
          "type": "text",
          "content": "\n      "
        }
      ],
      "attributes": [
        {
          "name": "maaa",
          "value": "a"
        }
      ],
      "tagName": "html"
    }
  ]
}

Look at the DOM tree we have here. The structure is correct, but it is bare, without any decoration.

Step three: CSS computing

Let’s make it a beautiful Christmas tree with CSS in this step!

During the CSS computation step, parsed CSS rules are added to the matched DOM elements. Note that CSS computing also occurs during the construction of the DOM tree.

We need to install the css library, which parses CSS code into a CSS AST (abstract syntax tree). To keep things simple, the AST takes over the role of the CSSOM in our sample project.

npm install css

Gather the CSS rules

First, we need to gather all CSS rules together from the CSS AST.

./client/parser.js

const css = require("css");
...

// to save the CSS rules
const rules = [];

// gather all the CSS rules
function addCSSRules(text) {
  const ast = css.parse(text);
  rules.push(...ast.stylesheet.rules);
}

Then, execute it at the end of the style tag (ignore CSS in other locations).

So, in the emit() method:

...
else if (token.type === 'endTag') {
  if (top.tagName !== token.tagName) {
    throw new Error('Tag does not match');
  } else {
    if (top.tagName === 'style') {
      addCSSRules(top.children[0].content); // top is the style element; its first child is the text node that holds the CSS
    }
    stack.pop();
  }
  currentTextNode = null;
}
...

If you print the rules, you’ll find that the CSS rules have been collected.

Match CSS rules and elements

When should we do it? Generally speaking, we try to make sure that every selector can be evaluated as soon as the startTag arrives. With the later addition of advanced selectors, this rule has been loosened. Due to the limited length of the article, we only use simple selectors in our sample code, so when a startTag arrives, it is already possible to determine which CSS rules the element matches.

In real browsers, there may be style tags in the body that require recalculation of CSS. We ignore this kind of situation.

Call computeCSS() in the emit function.

./client/parser.js

function emit(token) {
  let top = stack[stack.length - 1];

  if (token.type === 'startTag') {
    let element = {...}

    for (const prop in token) {
      ...
    }
    // CSS computing happens during the DOM tree construction 
    computeCSS(element);
    
    top.children.push(element);
    ...

In the computeCSS function, in order to match the selector with the corresponding element, we need to get the ancestor element sequence of the current element.

const elements = stack.slice().reverse();

All ancestor elements of the current element are stored in the stack. Because the stack is changing, the slice() method is used to save a copy of the current state.
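
The copy also matters for another reason: reverse() reorders an array in place, so reversing the stack directly would corrupt the parser state. A tiny sketch with strings standing in for the element objects (the names are placeholders, not the real stack content):

```javascript
// A stand-in for the parser's stack: document -> html -> body -> div
const stack = ['document', 'html', 'body', 'div'];

// slice() copies first, so the in-place reverse() leaves the stack untouched
const elements = stack.slice().reverse();

console.log(elements); // [ 'div', 'body', 'html', 'document' ]
console.log(stack);    // [ 'document', 'html', 'body', 'div' ]
```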

When we try to determine whether an element matches a selector, we start with the current element, then move to its parent element, and continue outward step by step. Take “div #myid” as an example: div can match any ancestor element, but #myid must match the current element.
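
The outward walk can be sketched on its own, independent of the parser. In the sketch below, matchSimple() and matchesDescendant() are hypothetical helpers, and elements are reduced to plain { tagName, id } objects; only id and type selectors are handled.

```javascript
// Hypothetical simplified matcher: elements are { tagName, id } objects.
function matchSimple(element, selector) {
  if (selector.charAt(0) === '#') return element.id === selector.slice(1);
  return element.tagName === selector;
}

// chain[0] is the current element, chain[1] its parent, and so on outward.
function matchesDescendant(selectorText, chain) {
  const parts = selectorText.split(' ').reverse(); // rightmost part first
  // the rightmost simple selector must match the current element itself
  if (!matchSimple(chain[0], parts[0])) return false;

  let partIndex = 1;
  // every remaining part may match any ancestor, in order, walking outward
  for (let i = 1; i < chain.length && partIndex < parts.length; i++) {
    if (matchSimple(chain[i], parts[partIndex])) partIndex++;
  }
  return partIndex >= parts.length;
}

const img = { tagName: 'img', id: 'myid' };
const chain = [img, { tagName: 'div' }, { tagName: 'body' }, { tagName: 'html' }];

console.log(matchesDescendant('div #myid', chain)); // true
console.log(matchesDescendant('#myid div', chain)); // false
```

The second call fails immediately because the rightmost part, div, does not match the current element img.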

In addition, we need a way to calculate whether the selector matches the element. Also for simplicity, we only deal with the case of descendant selectors composed of simple selectors and spaces, like ‘div #myid’.

./client/parser.js

...

// assuming selector is a simple selector (.class #id tagname)
function match(element, selector) {
  if (!selector || !element.attributes) return false;

  if (selector.charAt(0) === '#') { // id selector
    const attr = element.attributes.filter(
      ({ name }) => (name === 'id')
    )[0];
    if (attr && attr.value === selector.replace('#', '')) {
      return true;
    }
  } else if (selector.charAt(0) === '.') { // class selector
    const attr = element.attributes.filter(
      ({ name }) => (name === 'class')
    )[0];
    if (attr && attr.value === selector.replace(".", "")) {
      return true;
    }
  } else { // type selector
    if (element.tagName === selector) return true;
  }
}

function computeCSS(element) {
  const elements = stack.slice().reverse();

  for (const rule of rules) {
    // rule.selectors comes from the css AST; split the first selector on spaces and reverse it
    const selectors = rule.selectors[0].split(' ').reverse();
    // selectors are like [ '#myid', 'div', 'body' ]

    if (!match(element, selectors[0])) continue;

    let matched = false;
    let selectorIndex = 1; // elements holds the parent and ancestor elements, so here we start from 1
    for (let elementIndex = 0; elementIndex < elements.length; elementIndex++) {
      if (match(elements[elementIndex], selectors[selectorIndex])) {
        selectorIndex++;
      }
    }
    if (selectorIndex >= selectors.length) {
      // all selectors are matched
      matched = true;
    }
    if (matched) {
      console.log(`Selector "${rule.selectors[0]}" has matched element ${JSON.stringify(element)}`)
    }
  }
}

...

From the printed result, we can see that the element and the selector have been correctly matched.

Selector "body div #myid" has matched element {"type":"element","children":[],"attributes":[{"name":"id","value":"myid"},{"name":"isSelfClosing","value":true}],"tagName":"img"}

After the successful match, we should add CSS rules to the corresponding elements.

This step is very simple. In the computeCSS function, we add a computedStyle property to the element and save the CSS rules into it when matching.

function computeCSS(element) {
  if (!element.computedStyle) {
    element.computedStyle = {};
  }

  const elements = stack.slice().reverse();

  for (const rule of rules) {
    ...
    
    if (matched) {
      const { computedStyle } = element;
      for (const declaration of rule.declarations) {
        const { property, value } = declaration;
        if (!computedStyle[property]) {
          computedStyle[property] = {};
        }
        computedStyle[property] = value;
      }
    }
  }
  console.log(element.computedStyle)
}

Now the elements have their CSS rules, but that is not enough yet. Check the two img elements and you will find that they have the same computedStyle property, which is wrong.

{ width: '30px', 'background-color': '#ff1111' }

Why? The reason is CSS specificity. Rules with higher specificity should always take priority. The result is currently wrong because the later low-specificity rule overwrites the earlier high-specificity rule.

To calculate the CSS specificity, we need a four-tuple: [inline, id, class/attribute, tagName]. Inline styles have the highest priority, and the remaining entries count the selectors of each type.

For example, for a specificity [a, b, c, d], the calculation formula is:

specificity = a * N³ + b * N² + c * N + d

N is a large number. In old versions of IE, in order to save memory, N was only 256, which is not large enough and resulted in a funny bug: 256 classes were equivalent to one id. Of course, any sane developer would not write 256 selectors. Nowadays, most browsers set N to 65536.
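
As a quick worked example of the formula, the sketch below collapses the two selectors from our test page into single numbers. The specificityValue() helper is hypothetical, and N is assumed to be 65536 as mentioned above.

```javascript
// N is assumed to be 65536 here, following the remark above.
const N = 65536;

// Collapses [inline, id, class, tag] into a single comparable number.
function specificityValue([a, b, c, d]) {
  return a * N ** 3 + b * N ** 2 + c * N + d;
}

const withId = specificityValue([0, 1, 0, 2]);   // "body div #myid"
const tagsOnly = specificityValue([0, 0, 0, 3]); // "body div img"

console.log(withId);            // 4294967298
console.log(tagsOnly);          // 3
console.log(withId > tagsOnly); // true
```

One id outweighs any realistic number of type selectors, which is why “body div #myid” must win over “body div img”.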

Based on this, we can implement functions for calculating and comparing CSS specificity. Again, this is also a simplified version.

function getSpecificity(selector) {
  const specificity = [0, 0, 0, 0];
  // Similarly, assuming that selector is composed of simple selectors.
  const selectors = selector.split(' ');

  for (const item of selectors) {
    if (item.charAt(0) === '#') {
      specificity[1] += 1;
    } else if (item.charAt(0) === '.') {
      specificity[2] += 1;
    } else {
      specificity[3] += 1;
    }
  }
  return specificity;
}

function compareSpecificity(sp1, sp2) {
  if (sp1[0] - sp2[0]) {
    return sp1[0] - sp2[0];
  }
  if (sp1[1] - sp2[1]) {
    return sp1[1] - sp2[1];
  }
  if (sp1[2] - sp2[2]) {
    return sp1[2] - sp2[2];
  }
  return sp1[3] - sp2[3];
}

In the computeCSS function, when an element matches a selector, instead of simply applying the CSS rules, we need to compare the CSS specificity and use the rules with higher specificity.

    if (matched) {
      const { computedStyle } = element;
      const specificity = getSpecificity(rule.selectors[0]);

      for (const declaration of rule.declarations) {
        const { property, value } = declaration;
        if (!computedStyle[property]) {
          computedStyle[property] = {};
        }
        // computedStyle[property] = value;
        // CSS specificity
        if (!computedStyle[property].specificity) {
          // first rule seen for this property
          computedStyle[property] = { value, specificity };
        } else if (compareSpecificity(computedStyle[property].specificity, specificity) <= 0) {
          // the current selector has a higher specificity than the previous one
          // (or the same specificity but comes later), so it overrides the previous rule.
          // Do not spread the old object after value/specificity, or the old values win again.
          computedStyle[property] = { value, specificity };
        }
      }
    }

Run it again. As the output shows, the background-color of the img element with the ID is now correctly computed as '#ff5000'.

Now, if you print the DOM tree again, you will see a beautiful Christmas tree with all the CSS decoration.

Next, our job is to calculate the position of each element so we can get a DOM tree with position information.

Step four: layout

In CSS, there are three generations of layout technology:

normal flow
flexbox
grid

We are going to implement the most popular one: flexbox. If you are not familiar with this technique, please check this link.

Main axis and cross axis

Before we start, we must first understand the concepts of the main axis and the cross axis. By default, the main axis is horizontal (from left to right) and the cross axis is vertical (from top to bottom). This can change with our settings: when the value of ‘flex-direction’ is ‘row’, the main axis is horizontal, and when the value is ‘column’, the main axis is vertical. Using these concepts helps us avoid a lot of unnecessary if-else code.
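
To illustrate how the axis abstraction removes the if-else code, here is a small sketch. Both helpers, getAxes() and placeAlongMainAxis(), are hypothetical, and the reverse directions are left out to keep it short.

```javascript
// Hypothetical helper: maps a flex-direction to axis-neutral property names.
// Only 'row' and 'column' are handled here; the reverse variants are omitted.
function getAxes(flexDirection) {
  return flexDirection === 'column'
    ? { mainSize: 'height', mainStart: 'top', crossSize: 'width', crossStart: 'left' }
    : { mainSize: 'width', mainStart: 'left', crossSize: 'height', crossStart: 'top' };
}

// Places items one after another along the main axis, whichever axis that is.
function placeAlongMainAxis(items, flexDirection) {
  const { mainSize, mainStart } = getAxes(flexDirection);
  let current = 0;
  for (const item of items) {
    item[mainStart] = current;
    current += item[mainSize];
  }
  return items;
}

const asRow = placeAlongMainAxis([{ width: 100, height: 20 }, { width: 50, height: 30 }], 'row');
const asColumn = placeAlongMainAxis([{ width: 100, height: 20 }, { width: 50, height: 30 }], 'column');

console.log(asRow);
console.log(asColumn);
```

The placement loop never mentions width, height, left, or top directly, so the same code works for both directions.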

When to do the layout

Suppose we already have the layout() function. When should we call it? To calculate the flexbox layout, we need to know all the child elements of the current element, so we should call the layout() function in the ‘endTag’ branch.

./client/parser.js

const css = require("css");
const layout = require("./layout.js");

...

function emit(token) {
  ...
   else if (token.type === 'endTag') {
    if (top.tagName !== token.tagName) {
      throw new Error('Tag does not match');
    } else {
      if (top.tagName === 'style') {
        addCSSRules(top.children[0].content);
      }
      // right here!
      layout(top);
      stack.pop();
    }
    currentTextNode = null;
  }
  ...
}

Pre-processing

In the newly created layout.js file, let's add a getStyle() function to do some pre-processing work, like filtering out unwanted elements and type conversion (string to number).
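
The type conversion can be sketched in isolation as follows. The toNumberIfPossible() helper is hypothetical and only mirrors the idea: pixel values and plain numbers become numbers, everything else stays a string.

```javascript
// Hypothetical helper mirroring the idea of the pre-processing step.
function toNumberIfPossible(value) {
  const text = String(value);
  if (/px$/.test(text)) return parseInt(text, 10);     // '100px' -> 100
  if (/^[0-9.]+$/.test(text)) return parseFloat(text); // '1' -> 1
  return value;                                        // 'flex' stays a string
}

console.log(toNumberIfPossible('100px')); // 100
console.log(toNumberIfPossible('1'));     // 1
console.log(toNumberIfPossible('flex'));  // flex
```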

./client/layout.js

function getStyle(element) {
  if (!element.style) element.style = {};

  const { computedStyle, style } = element;

  for (const prop in computedStyle) {
    // skip properties that have already been processed
    if (style[prop]) continue;

    element.style[prop] = computedStyle[prop].value;

    // '100px' -> 100
    if (element.style[prop].toString().match(/px$/)) {
      element.style[prop] = parseInt(element.style[prop], 10);
    }

    // plain numbers such as the flex value: '1' -> 1 (parseFloat keeps decimals)
    if (element.style[prop].toString().match(/^[0-9\.]+$/)) {
      element.style[prop] = parseFloat(element.style[prop]);
    }
  }
  return element.style;
}

Set default values

Set default values in the layout function.

./client/layout.js

function layout(element) {
  if (!element.computedStyle) return;

  const elementStyle = getStyle(element);

  if (elementStyle.display !== 'flex') return;

  // filter out text nodes
  const elementItems = element.children.filter(
    el => el.type === 'element'
  );

  // to support the order property (order lives in the item's style, not on the element itself)
  elementItems.sort((a, b) => (getStyle(a).order || 0) - (getStyle(b).order || 0));

  const style = elementStyle;

  ['width', 'height'].forEach(size => {
    if (style[size] === 'auto' || style[size] === '') {
      style[size] = null;
    }
  })

  // set default values
  if (!style['flex-direction'] || style['flex-direction'] === 'auto') {
    style['flex-direction'] = 'row';
  }
  if (!style['align-items'] || style['align-items'] === 'auto') {
    style['align-items'] = 'stretch';
  }
  if (!style['justify-content'] || style['justify-content'] === 'auto') {
    style['justify-content'] = 'flex-start';
  }
  if (!style['flex-wrap'] || style['flex-wrap'] === 'auto') {
    style['flex-wrap'] = 'nowrap';
  }
  if (!style['align-content'] || style['align-content'] === 'auto') {
    style['align-content'] = 'stretch';
  }

  let mainSize,
    mainStart,
    mainEnd,
    mainSign,
    mainBase,
    crossSize,
    crossStart,
    crossEnd,
    crossSign,
    crossBase;

  if (style['flex-direction'] === 'row') {
    mainSize = 'width';
    mainStart = 'left';
    mainEnd = 'right';
    mainSign = +1;
    mainBase = 0;

    crossSize = 'height';
    crossStart = 'top';
    crossEnd = 'bottom';
  } else if (style['flex-direction'] === 'row-reverse') {
    mainSize = 'width';
    mainStart = 'right';
    mainEnd = 'left';
    mainSign = -1;
    mainBase = style.width;

    crossSize = 'height';
    crossStart = 'top';
    crossEnd = 'bottom';
  } else if (style['flex-direction'] === 'column') {
    mainSize = 'height';
    mainStart = 'top';
    mainEnd = 'bottom';
    mainSign = +1;
    mainBase = 0;

    crossSize = 'width';
    crossStart = 'left';
    crossEnd = 'right';
  } else if (style['flex-direction'] === 'column-reverse') {
    mainSize = 'height';
    mainStart = 'bottom';
    mainEnd = 'top';
    mainSign = -1;
    mainBase = style.height;

    crossSize = 'width';
    crossStart = 'left';
    crossEnd = 'right';
  }

  if (style['flex-wrap'] === 'wrap-reverse') {
    // swap the existing variables; declaring new ones with const here
    // would shadow them and throw a ReferenceError
    [crossStart, crossEnd] = [crossEnd, crossStart];
    crossSign = -1;
  } else {
    crossBase = 0;
    crossSign = +1;
  }
}

Collect elements into flex line

Put the elements on one line if there is enough room or if flex-wrap is nowrap; otherwise, start a new line.

There is a special case: if the parent element does not have a main axis size (like width), the parent element is stretched by its child elements. This mode is called “auto main size”. In this case, the elements are gathered on one line.

./client/layout.js (continue with the previous code block)

  // the special case
  let isAutoMainSize = false;
  if (!style[mainSize]) {
    // auto sizing
    elementStyle[mainSize] = 0;
    for (let i = 0; i < elementItems.length; i++) {
      const itemStyle = getStyle(elementItems[i]);
      // both checks must hold, so use && (with || the condition is always true)
      if (itemStyle[mainSize] !== null && itemStyle[mainSize] !== (void 0)) {
        elementStyle[mainSize] = elementStyle[mainSize] + itemStyle[mainSize];
      }
    }
    isAutoMainSize = true;
  }
  // normal case
  let flexLine = [];
  const flexLines = [flexLine];

  // mainSpace is the current remaining space, set it to the main size of the parent element
  let mainSpace = elementStyle[mainSize];
  let crossSpace = 0;

  // loop over all the flex items
  for (let i = 0; i < elementItems.length; i++) {
    const item = elementItems[i];
    const itemStyle = getStyle(item);

    // treat a missing main size as 0
    if (itemStyle[mainSize] === null || itemStyle[mainSize] === (void 0)) {
      itemStyle[mainSize] = 0;
    }

    if (itemStyle.flex) { // a flexible item always fits, its size is decided later
      flexLine.push(item);
    } else if (style['flex-wrap'] === 'nowrap' || isAutoMainSize) {
      // single line: either condition alone is enough, as described above
      mainSpace -= itemStyle[mainSize];
      if (itemStyle[crossSize] !== null && itemStyle[crossSize] !== (void 0)) {
        crossSpace = Math.max(crossSpace, itemStyle[crossSize]);
      }
      flexLine.push(item);
    } else { // multiple lines
      // if the main size of the element is larger than that of its parent element,
      // it is compressed to the same size as the parent element
      if (itemStyle[mainSize] > style[mainSize]) {
        itemStyle[mainSize] = style[mainSize];
      }
      // start a new line
      if (mainSpace < itemStyle[mainSize]) {
        flexLine.mainSpace = mainSpace;
        flexLine.crossSpace = crossSpace;

        // create the new line
        flexLine = [item];
        flexLines.push(flexLine);
        // reset these two props
        mainSpace = style[mainSize];
        crossSpace = 0;
      } else {
        flexLine.push(item);
      }

      if (itemStyle[crossSize] !== null && itemStyle[crossSize] !== (void 0)) {
        crossSpace = Math.max(crossSpace, itemStyle[crossSize]);
      }
      mainSpace -= itemStyle[mainSize];
    }
  }
  flexLine.mainSpace = mainSpace;
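
To see the line-collection rule on its own, here is a stripped-down sketch in which items are reduced to their main-axis sizes. The collectLines() helper is hypothetical and ignores flex items and the cross axis.

```javascript
// Hypothetical, stripped-down version: items are plain main-axis sizes.
function collectLines(containerSize, itemSizes, wrap) {
  let line = [];
  const lines = [line];
  let remaining = containerSize;

  for (const size of itemSizes) {
    // start a new line only when wrapping is allowed and the item does not fit
    // (an empty line always accepts the item)
    if (wrap && remaining < size && line.length > 0) {
      line = [];
      lines.push(line);
      remaining = containerSize;
    }
    line.push(size);
    remaining -= size;
  }
  return lines;
}

console.log(collectLines(500, [200, 200, 200], true));  // [ [ 200, 200 ], [ 200 ] ]
console.log(collectLines(500, [200, 200, 200], false)); // [ [ 200, 200, 200 ] ]
```

With wrapping, the third item no longer fits into the remaining 100px and opens a second line. Without wrapping, everything stays on one line and the remaining space becomes negative.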

Calculate the main axis

Find all flex items and distribute the remaining space (mainSpace) along the main axis among them in proportion to their flex values. Special case: if the remaining space is negative (which can only happen on a single line, like in “nowrap” mode), set the main axis size of all flex items to 0 and compress the other elements proportionally.
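
Both cases can be checked with a few numbers. In the sketch below, distributeMainSpace() is a hypothetical helper, and items are either { size } for fixed items or { flex } for flexible ones.

```javascript
// Hypothetical helper: items are { size } (fixed) or { flex } (flexible).
function distributeMainSpace(containerSize, items) {
  const fixedTotal = items.reduce((sum, item) => sum + (item.flex ? 0 : item.size), 0);
  const flexTotal = items.reduce((sum, item) => sum + (item.flex || 0), 0);
  const remaining = containerSize - fixedTotal;

  if (remaining < 0) {
    // overflow: flex items collapse to 0, fixed items shrink by the same ratio
    const scale = containerSize / fixedTotal;
    return items.map(item => (item.flex ? 0 : item.size * scale));
  }
  // otherwise flex items share the remaining space in proportion to their flex value
  return items.map(item => (item.flex ? (remaining / flexTotal) * item.flex : item.size));
}

console.log(distributeMainSpace(500, [{ size: 200 }, { flex: 1 }]));
// [ 200, 300 ]
console.log(distributeMainSpace(500, [{ size: 400 }, { size: 400 }, { flex: 1 }]));
// [ 250, 250, 0 ]
```

The first call matches the test page used later: a 500px container with a 200px item leaves 300px for the item with flex: 1.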

If there are no flex items, calculate the position of each element according to the value of justify-content.
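
The justify-content rules boil down to two numbers per line: where the first item starts and how large the gap between items is. A minimal sketch, assuming a left-to-right main axis; the justify() helper is hypothetical.

```javascript
// Hypothetical helper: returns the start offset and the gap between items
// for a line with `count` items and `freeSpace` left over (left-to-right only).
function justify(justifyContent, freeSpace, count) {
  switch (justifyContent) {
    case 'flex-end':
      return { start: freeSpace, gap: 0 };
    case 'center':
      return { start: freeSpace / 2, gap: 0 };
    case 'space-between':
      return { start: 0, gap: freeSpace / (count - 1) };
    case 'space-around': {
      const gap = freeSpace / count;
      return { start: gap / 2, gap };
    }
    case 'space-evenly': {
      const gap = freeSpace / (count + 1);
      return { start: gap, gap };
    }
    default: // flex-start
      return { start: 0, gap: 0 };
  }
}

// 3 items of 100px in a 540px container leave 240px of free space
console.log(justify('space-between', 240, 3)); // { start: 0, gap: 120 }
console.log(justify('space-around', 240, 3));  // { start: 40, gap: 80 }
console.log(justify('space-evenly', 240, 3));  // { start: 60, gap: 60 }
```

Note that the gap has to be computed before the start offset for space-around and space-evenly, because the offset depends on it.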

./client/layout.js (continue with the previous code block)

  if (style['flex-wrap'] === 'nowrap' || isAutoMainSize) {
    flexLine.crossSpace = (style[crossSize] !== undefined)
      ? style[crossSize]
      : crossSpace;
  } else {
    flexLine.crossSpace = crossSpace;
  }

  if (mainSpace < 0) {
    // overflow (happens only if the container is single line): scale every item
    // the special case, proportional compression
    let scale = style[mainSize] / (style[mainSize] - mainSpace);
    let currentMain = mainBase;

    for (let i = 0; i < elementItems.length; i++) {
      const itemStyle = getStyle(elementItems[i]);
      if (itemStyle.flex) {
        // flex elements do not participate in proportional compression
        itemStyle[mainSize] = 0;
      }

      itemStyle[mainSize] = itemStyle[mainSize] * scale;

      // if the flex direction is row, this part calculates left and right after compression
      itemStyle[mainStart] = currentMain;
      itemStyle[mainEnd] = itemStyle[mainStart] + mainSign * itemStyle[mainSize];
      currentMain = itemStyle[mainEnd];
    }
  } else {
    // multiple lines
    // process each flex line
    flexLines.forEach(flexLine => {
      const mainSpace = flexLine.mainSpace;
      let itemStyle;
      let flexTotal = 0;
      for (let i = 0; i < flexLine.length; i++) {
        itemStyle = getStyle(flexLine[i]);
        // find the flex items to get the value of flexTotal
        if ((itemStyle.flex !== null) && (itemStyle.flex !== (void 0))) {
          flexTotal += itemStyle.flex;
        }
      }

      // if flex elements exist, distribute mainSpace among them in proportion to their flex values
      if (flexTotal > 0) {
        let currentMain = mainBase;
        for (let i = 0; i < flexLine.length; i++) {
          itemStyle = getStyle(flexLine[i]);

          if (itemStyle.flex) {
            itemStyle[mainSize] = (mainSpace / flexTotal) * itemStyle.flex;
          }

          itemStyle[mainStart] = currentMain;
          itemStyle[mainEnd] = itemStyle[mainStart] + mainSign * itemStyle[mainSize];
          currentMain = itemStyle[mainEnd];
        }
      } else {
        // if there is no flex element, the remaining space in the main axis direction
        // is allocated according to the rules of justify-content
        // (gaps are counted per line, so use flexLine.length)
        let currentMain, gap;
        if (style['justify-content'] === 'flex-start') {
          currentMain = mainBase;
          gap = 0; // there is no space between each element
        }
        if (style['justify-content'] === 'flex-end') {
          currentMain = mainBase + mainSpace * mainSign;
          gap = 0;
        }
        if (style['justify-content'] === 'center') {
          currentMain = mainBase + mainSpace / 2 * mainSign;
          gap = 0;
        }
        if (style['justify-content'] === 'space-between') {
          currentMain = mainBase;
          gap = mainSpace / (flexLine.length - 1) * mainSign;
        }
        if (style['justify-content'] === 'space-around') {
          // gap must be computed before it is used for currentMain
          gap = mainSpace / flexLine.length * mainSign;
          currentMain = gap / 2 + mainBase;
        }
        if (style['justify-content'] === 'space-evenly') {
          gap = mainSpace / (flexLine.length + 1) * mainSign;
          currentMain = gap + mainBase;
        }
        // calculate mainEnd based on mainStart and mainSize
        for (let i = 0; i < flexLine.length; i++) {
          // fetch the style of the current item, otherwise only the last item gets positioned
          itemStyle = getStyle(flexLine[i]);
          itemStyle[mainStart] = currentMain;
          itemStyle[mainEnd] = itemStyle[mainStart] + mainSign * itemStyle[mainSize];
          currentMain = itemStyle[mainEnd] + gap;
        }
      }
    })
  }

If the flex-direction is “row”, at this point we have the value of width, left, and right. The next step is to calculate the cross axis to get the value of height, top, and bottom. When these are determined, the position of the element is also determined.

Calculate the cross axis

the height of each line is based on the height of its largest element
the specific position of each element is calculated according to the values of align-items and align-self
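
For a single item inside one line, the alignment rules can be sketched as follows, assuming a top-to-bottom cross axis. The alignOnCrossAxis() helper is hypothetical; itemSize may be left out for an item without its own cross size.

```javascript
// Hypothetical helper: positions one item inside a line on the cross axis
// (top-to-bottom only; `itemSize` may be undefined when the item has no height).
function alignOnCrossAxis(align, lineStart, lineSize, itemSize) {
  const size = itemSize === undefined ? 0 : itemSize;
  switch (align) {
    case 'flex-end':
      return { start: lineStart + lineSize - size, end: lineStart + lineSize };
    case 'center': {
      const start = lineStart + (lineSize - size) / 2;
      return { start, end: start + size };
    }
    case 'stretch': {
      // an item without its own size fills the whole line
      const stretched = itemSize === undefined ? lineSize : itemSize;
      return { start: lineStart, end: lineStart + stretched };
    }
    default: // flex-start
      return { start: lineStart, end: lineStart + size };
  }
}

console.log(alignOnCrossAxis('flex-start', 0, 300, 100)); // { start: 0, end: 100 }
console.log(alignOnCrossAxis('flex-end', 0, 300, 100));   // { start: 200, end: 300 }
console.log(alignOnCrossAxis('center', 0, 300, 100));     // { start: 100, end: 200 }
console.log(alignOnCrossAxis('stretch', 0, 300));         // { start: 0, end: 300 }
```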

./client/layout.js (continue with the previous code block)

  if (!style[crossSize]) {
    // if the parent element has no crossSize, crossSpace will always be zero
    crossSpace = 0;
    elementStyle[crossSize] = 0;
    // add up the cross sizes of all the lines
    for (let i = 0; i < flexLines.length; i++) {
      elementStyle[crossSize] = elementStyle[crossSize] + flexLines[i].crossSpace;
    }
  } else {
    crossSpace = style[crossSize];
    for (let i = 0; i < flexLines.length; i++) {
      crossSpace -= flexLines[i].crossSpace;
    }
  }

  // wrap-reverse: this affects crossBase
  if (style['flex-wrap'] === 'wrap-reverse') {
    crossBase = style[crossSize];
  } else {
    crossBase = 0;
  }

  let lineSize = style[crossSize] / flexLines.length;
  let gap;

  if (style['align-content'] === 'flex-start') {
    crossBase += 0;
    gap = 0;
  }
  if (style['align-content'] === 'flex-end') {
    crossBase += crossSpace * crossSign;
    gap = 0;
  }
  if (style['align-content'] === 'center') {
    crossBase += crossSpace * crossSign / 2;
    gap = 0;
  }
  if (style['align-content'] === 'space-between') {
    crossBase += 0;
    gap = crossSpace / (flexLines.length - 1);
  }
  if (style['align-content'] === 'space-around') {
    // gap must be computed before it is used for crossBase
    gap = crossSpace / (flexLines.length);
    crossBase += crossSign * gap / 2;
  }
  if (style['align-content'] === 'stretch') {
    crossBase += 0;
    gap = 0;
  }

  flexLines.forEach(flexLine => {
    let lineCrossSize = style['align-content'] === 'stretch'
      ? flexLine.crossSpace + crossSpace / flexLines.length
      : flexLine.crossSpace;

    for (let i = 0; i < flexLine.length; i++) {
      let itemStyle = getStyle(flexLine[i]);
      let align = itemStyle['align-self'] || style['align-items'];

      // an item without a cross size fills the line when stretched, otherwise it collapses to 0
      if (itemStyle[crossSize] === null || itemStyle[crossSize] === (void 0)) {
        itemStyle[crossSize] = align === 'stretch'
          ? lineCrossSize
          : 0;
      }

      if (align === 'flex-start') {
        itemStyle[crossStart] = crossBase;
        itemStyle[crossEnd] = itemStyle[crossStart] + crossSign * itemStyle[crossSize];
      }
      if (align === 'flex-end') {
        // the end edge sits on the far side of the line, the start edge is derived from it
        itemStyle[crossEnd] = crossBase + crossSign * lineCrossSize;
        itemStyle[crossStart] = itemStyle[crossEnd] - crossSign * itemStyle[crossSize];
      }
      if (align === 'center') {
        // split the free space of the line equally on both sides of the item
        itemStyle[crossStart] = crossBase + crossSign * (lineCrossSize - itemStyle[crossSize]) / 2;
        itemStyle[crossEnd] = itemStyle[crossStart] + crossSign * itemStyle[crossSize];
      }
      if (align === 'stretch') {
        itemStyle[crossStart] = crossBase;
        itemStyle[crossEnd] = crossBase + crossSign * ((itemStyle[crossSize] !== null && itemStyle[crossSize] !== (void 0)) ? itemStyle[crossSize] : lineCrossSize);
        itemStyle[crossSize] = crossSign * (itemStyle[crossEnd] - itemStyle[crossStart]);
      }
    }
    crossBase += crossSign * (lineCrossSize + gap);
  })

Now, we have a DOM tree with the position data!

Send an HTML with flexbox layout from the server

./server/server.js

...
     
     response.end(
        `<html maaa=a >
      <head>
            <style>
      #container {
        width:500px;
        height:300px;
        display:flex;
        background-color:rgb(139,195,74);
      }
      #container #myid {
        width:200px;
        height:100px;
        background-color:rgb(255,235,59);
      }
      #container .c1 {
        flex:1;
        background-color:rgb(56,142,60);
      }
        </style>
      </head>
      <body>
        <div id="container">
            <div id="myid"/>
            <div class="c1" />
        </div>
      </body>
      </html>`);
      ...

As you can see from the new DOM tree, the positions of the elements are already there. It’s ready to be rendered!

Step five: render

In the field of computer graphics, the term “render” specifically refers to the process of turning a model into a bitmap. Note that some frameworks, like React, use “render” for the process of turning data into HTML code, which is a different thing.

The bitmap here is a two-dimensional table built in memory that stores the color of each pixel of a picture.
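
To make the idea of a bitmap concrete, here is a minimal sketch of such a table and of painting a rectangle into it. This is not how the “images” library works internally; createBitmap(), fillRect(), and getPixel() are hypothetical helpers, and the layout of 3 bytes per pixel is an assumption made for the example.

```javascript
// Hypothetical in-memory bitmap: 3 bytes (r, g, b) per pixel, row by row.
function createBitmap(width, height) {
  return { width, height, data: new Uint8Array(width * height * 3) };
}

// Paints a rectangle, clipped to the bitmap bounds.
function fillRect(bitmap, left, top, width, height, [r, g, b]) {
  for (let y = top; y < top + height && y < bitmap.height; y++) {
    for (let x = left; x < left + width && x < bitmap.width; x++) {
      const offset = (y * bitmap.width + x) * 3;
      bitmap.data[offset] = r;
      bitmap.data[offset + 1] = g;
      bitmap.data[offset + 2] = b;
    }
  }
}

function getPixel(bitmap, x, y) {
  const offset = (y * bitmap.width + x) * 3;
  return Array.from(bitmap.data.slice(offset, offset + 3));
}

const bitmap = createBitmap(4, 4);
fillRect(bitmap, 1, 1, 2, 2, [255, 235, 59]);

console.log(getPixel(bitmap, 0, 0)); // [ 0, 0, 0 ]
console.log(getPixel(bitmap, 1, 1)); // [ 255, 235, 59 ]
```

Rendering an element with a background color is essentially a fillRect() call with the left, top, width, and height computed in the layout step.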

Here we use the “images” library for painting in the viewport. It supports painting background-color, border, background-image, etc.

npm install images

Render one flex item

In the beginning, let’s try to render one flex item.

./client/render.js

const images = require("images");

function render(viewport, element) {
  if (element.style) {
    let img = images(element.style.width, element.style.height);
    // to simplify, only deal with background color
    if (element.style["background-color"]) {
      const color = element.style["background-color"];
      // use the match result directly instead of the deprecated RegExp.$1 properties,
      // and allow spaces after the commas
      const rgb = color.match(/rgb\((\d+),\s*(\d+),\s*(\d+)\)/);
      if (rgb) {
        img.fill(Number(rgb[1]), Number(rgb[2]), Number(rgb[3]));
        viewport.draw(
          img,
          element.style.left || 0,
          element.style.top || 0
        );
      }
    }
  }
}

module.exports = render;

./client/index.js

const net = require('net');
const images = require("images");
const parser = require('./parser');
const render = require("./render");

...

void async function () {
  const request = new Request({...});

  const response = await request.send();
  const dom = parser.parseHTML(response.body);

  const viewport = images(800, 600);
  render(viewport, dom.children[0].children[3].children[1].children[3]); // the element to render
  viewport.save('viewport.jpg');
}()

Check viewport.jpg: the item has been rendered successfully!

Full rendering

This step is quite easy: calling the render() method on the child elements recursively renders the whole DOM tree.

Text rendering is difficult because it relies on a font library to turn the glyphs into pictures. We ignore it in our mini-browser project.

./client/render.js

function render(viewport, element) {
  if (element.style) {
    ...
  }

  if (element.children) {
    for (const child of element.children) {
      render(viewport, child);
    }
  }
}

Run index.js and check the picture again. You will find that the elements are rendered perfectly.

Let’s change some flex rules to see whether the browser can handle it, like this:

The answer is yes!

We have successfully developed a toy mini-browser that supports HTTP requests, HTML/CSS parsing, and flexbox layout.

After going through these steps, I hope you have a deeper understanding of how browsers work. If you want to view the complete code, please refer to this repository. There, the client-side index.js is split into a few more files, but the basic structure remains the same.

src
├── client
│   ├── index.js
│   ├── Request.js
│   ├── HTMLparser.js
│   ├── ResponseParser.js
│   ├── ChunkedBodyParser.js
│   ├── layout.js
│   ├── render.js
│   ├── package.json
│   ├── package-lock.json
│   └── viewport.jpg
└── server
    ├── server.js
    ├── package.json
    └── package-lock.json


Written by Yujie Wang