Browsers and their working mechanisms

Suman Kunwar
Published in Tech Desk
5 min read · Jun 2, 2022

Browsers are among the most widely used applications. Through its graphical user interface, a browser lets users access and display web pages and other online content. Chrome, Safari, and Opera are examples of desktop browsers, whereas Android Browser, Safari on iPhone, and UC Browser are examples of mobile browsers.

A browser presents the web resources a user chooses, such as web pages, PDFs, videos, and other files, by requesting them from a server. Understanding how a browser works internally helps developers make better decisions and see the reasoning behind common best practices.

Chrome has the highest usage share worldwide, according to StatCounter statistics (as of May 2022).

Browser Components

The browser's main components are listed below.

  1. The user interface: Users interact with the browser through the user interface. The address bar, back and forward buttons, home button, refresh and stop buttons, and bookmark option are some examples. Everything the browser displays except the window showing the requested page belongs to it.
  2. The browser engine: It acts as a bridge between the user interface and the rendering engine, and communicates with the data storage component. In response to input from the user interface, it queries and manipulates the rendering engine and stores data in data storage.
  3. The rendering engine: It is responsible for displaying the requested content. The content can be of different data types, some of which may require plugins. For HTML content, it parses the HTML and CSS and displays the parsed content on the screen.
  4. Networking: It is responsible for handling network tasks such as HTTP requests, WebSockets, and WebRTC (which uses the Real-time Transport Protocol, typically over UDP). This component may implement a cache of retrieved documents in order to reduce network traffic.
  5. UI backend: It is used for drawing basic widgets like combo boxes and windows. This backend exposes a generic interface that is not platform specific. Underneath, it uses the operating system's user interface methods.
  6. JavaScript interpreter: This component interprets and executes JavaScript code. After interpretation, the results are sent to the rendering engine. For an external script, it first fetches the resource from the network, and the HTML parser is paused until the script has executed.
  7. Data storage: This is a persistence layer used for storing cookies, cache, bookmarks, and preferences. It also backs storage mechanisms such as localStorage, IndexedDB, WebSQL, and FileSystem.
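To make the caching idea in the networking component concrete, here is a minimal sketch of a document cache keyed by URL. The `DocumentCache` class, the `fake_fetch` function, and the example URL are hypothetical names invented for this illustration. Real browser caches are more involved; for example, they honour HTTP headers such as Cache-Control.

```python
class DocumentCache:
    """Toy cache: only goes to the network for a URL it has not seen before."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable(url) -> bytes, stands in for the network layer
        self._store = {}             # url -> previously retrieved document
        self.network_requests = 0    # counts how often the network was really used

    def get(self, url):
        if url not in self._store:
            self.network_requests += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


def fake_fetch(url):
    # Hypothetical stand-in for a real HTTP request.
    return f"<html><body>{url}</body></html>".encode()


cache = DocumentCache(fake_fetch)
first = cache.get("https://example.com/404")
second = cache.get("https://example.com/404")   # served from the cache
print(cache.network_requests)
```

The second call returns the stored document, so only one request reaches the (fake) network.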

Rendering of HTML pages

Let’s take an example of this design and render it in a browser.

404 design

The HTML code for the above design is given below.

As mentioned above, the rendering engine is responsible for displaying the requested content in the browser. Different browsers use different rendering engines: Internet Explorer uses Trident, Firefox uses Gecko, and Safari uses WebKit. Chrome and Opera (from version 15) use Blink, a fork of WebKit.

The rendering engine receives the contents of the requested document from the networking layer in chunks, usually 8 kB in size. The basic flow of the rendering engine is shown below.

Fig: Basic Flow of Rendering Engine

1. Parsing HTML to construct the DOM tree

The rendering engine parses the chunks of the HTML document and converts the elements to Document Object Model (DOM) nodes in a tree called the “content tree” or the “DOM tree”. Here, each element node corresponds to an HTML tag.
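As a rough illustration of this step, the sketch below uses Python's standard `html.parser` module to turn chunks of markup into a small tree. The `Node` and `DomBuilder` classes, the `VOID_TAGS` set, and the sample 404 markup are assumptions made for this example, not the markup of the design above. Real engines implement a far more forgiving parsing algorithm.

```python
from html.parser import HTMLParser

VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}  # elements without a closing tag


class Node:
    def __init__(self, tag, attrs=None, text=""):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class DomBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]      # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element (never the document root).
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def build_dom(chunks):
    builder = DomBuilder()
    for chunk in chunks:              # the parser buffers a tag split across chunks
        builder.feed(chunk)
    builder.close()
    return builder.root


def outline(node, depth=0):
    label = f'"{node.text}"' if node.tag == "#text" else node.tag
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical markup, deliberately split in the middle of the <body> tag.
chunks = [
    '<html><head><meta charset="utf-8"></head><bo',
    "dy><h1>404</h1><p>Page not found</p></body></html>",
]
dom = build_dom(chunks)
print("\n".join(outline(dom)))
```

Feeding the markup chunk by chunk mirrors how the engine can start building the tree before the whole document has arrived.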

It also parses the CSS styles present in the web page and creates a CSS Object Model (CSSOM), which is essentially a “map” of CSS style rules in a tree-like structure.
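To give a feel for what such a “map” of rules contains, here is a minimal sketch that reads simple `selector { property: value; }` rules into Python data. The `parse_css` function and the sample stylesheet are assumptions for illustration. The sketch produces a flat list rather than a tree, and it ignores comments, at-rules, and the cascade, all of which a real CSSOM has to handle.

```python
import re


def parse_css(css):
    """Return a list of (selector, {property: value}) pairs in source order."""
    rules = []
    for selector_text, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        # A rule such as "h1, p { ... }" applies to each listed selector.
        for selector in selector_text.split(","):
            rules.append((selector.strip(), dict(declarations)))
    return rules


# Hypothetical stylesheet for a 404 page.
css = """
body { margin: 0; font-family: sans-serif; }
h1, p { text-align: center; }
.hint { display: none; }
"""
rules = parse_css(css)
for selector, declarations in rules:
    print(selector, declarations)
```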

The DOM tree representation of our HTML is shown below.

Fig. DOM Tree

2. Render tree construction

The CSSOM and the DOM trees are combined into a render tree, whose elements are listed in the order in which they will be displayed. Starting at the root of the DOM tree, the browser traverses each node. It omits nodes that are not visible, for instance meta tags, script tags, and any nodes hidden via CSS rules such as display: none. Firefox calls the elements in the render tree “frames”, whereas WebKit uses the term “render object”.
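The traversal can be sketched as a recursive walk that pairs each DOM node with its style and drops the nodes that will never be shown. The dictionary-based node shape, the `NON_VISUAL_TAGS` set, and the tag-name-only style lookup are simplifying assumptions for this example; real engines match full selectors and compute many more properties.

```python
NON_VISUAL_TAGS = {"head", "meta", "script", "link", "title", "style"}


def build_render_tree(node, styles):
    """Return a render node for `node`, or None if the node is not rendered."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    style = styles.get(node["tag"], {})       # assumed: styles are keyed by tag name
    if style.get("display") == "none":
        return None                           # hidden nodes drop their whole subtree
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child, styles)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}


def tags(render_node):
    """Flatten the render tree into display order, for inspection."""
    return [render_node["tag"]] + [t for c in render_node["children"] for t in tags(c)]


# Hypothetical DOM and styles for a small 404 page.
dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "meta"}]},
    {"tag": "body", "children": [
        {"tag": "h1"},
        {"tag": "p"},
        {"tag": "aside"},
        {"tag": "script"},
    ]},
]}
styles = {"h1": {"text-align": "center"}, "aside": {"display": "none"}}

render_tree = build_render_tree(dom, styles)
print(tags(render_tree))
```

Only html, body, h1, and p survive: the head, the script, and the hidden aside never reach the render tree.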

3. Layout of a render tree

When a renderer is created and added to the tree, it does not yet have a position or size. The process of calculating these values is called layout or reflow. Performing layout means computing the exact coordinates of each node so that it appears on the screen exactly as designed.

The root renderer is at the position 0,0 and its dimensions are the viewport, which is the visible part of the browser window. Each renderer has a layout or reflow method, and each of those invokes the layout method of its children that require layout.
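The recursion can be sketched as follows, under strong simplifying assumptions: every box is a block that takes the full available width, children are stacked vertically, and leaf heights are given rather than measured. The `Renderer` class, its `needs_layout` flag, and the pixel values are hypothetical names and numbers for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Renderer:
    name: str
    height: int = 0                       # assumed intrinsic height for leaves, in px
    children: list = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0
    needs_layout: bool = True             # "dirty" flag: position and size not yet known

    def layout(self, x, y, width):
        self.x, self.y, self.width = x, y, width
        if self.children:
            cursor = y                    # where the next child starts
            for child in self.children:
                if child.needs_layout:    # only lay out children that require it
                    child.layout(x, cursor, width)
                cursor = child.y + child.height
            self.height = cursor - y      # a container is as tall as its content
        self.needs_layout = False


h1 = Renderer("h1", height=40)
p = Renderer("p", height=20)
body = Renderer("body", children=[h1, p])
root = Renderer("root", children=[body])

root.layout(0, 0, 800)                    # root at 0,0 with an assumed 800 px wide viewport
for box in (root, body, h1, p):
    print(box.name, box.x, box.y, box.width, box.height)
```

After the call, the paragraph sits directly below the heading at y = 40, and the body is 60 px tall.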

4. Painting the Render Tree

To display the content on the screen, the render tree is traversed and each renderer's paint() method is called. In order to provide a better user experience, the rendering engine displays content as soon as possible: it begins building and laying out the render tree before all of the HTML has been parsed.
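A minimal sketch of that traversal is shown below. Instead of drawing pixels, each `paint()` call records a drawing command in a list. The `RenderBox` class, the `fill_rect` command name, and the sample boxes are assumptions for this example; an actual engine hands such commands to the UI backend or a graphics library.

```python
class RenderBox:
    def __init__(self, name, x, y, width, height, background=None, children=()):
        self.name = name
        self.x, self.y, self.width, self.height = x, y, width, height
        self.background = background
        self.children = list(children)

    def paint(self, display_list):
        # Paint this box first, then its children, so children appear on top.
        if self.background:
            display_list.append(
                ("fill_rect", self.x, self.y, self.width, self.height, self.background)
            )
        for child in self.children:
            child.paint(display_list)


# Hypothetical boxes that have already been through layout.
page = RenderBox("body", 0, 0, 800, 60, "white", [
    RenderBox("h1", 0, 0, 800, 40, "tomato"),
    RenderBox("p", 0, 40, 800, 20),          # no background, so nothing to fill
])

display_list = []
page.paint(display_list)
for command in display_list:
    print(command)
```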

404 html page

Let’s summarise

Now let’s put all this information together again and see the steps involved in rendering HTML by browsers.

  1. The browser starts off by constructing the DOM by parsing all the relevant HTML.
  2. After that, it looks for CSS and JavaScript resources and requests them. These are usually referenced in the head, where we put our external links.
  3. The browser then parses the CSS and constructs the CSSOM, and then runs the JavaScript.
  4. Then the DOM and CSSOM are merged into the render tree.
  5. Lastly, the page is presented to the user by running the layout and painting steps.


Thanks for reading! 🙏
