An Introduction to HTML: The Backbone of Front-End Development

CAROLINE
4 min read · May 2, 2023

The first and most fundamental component of any front-end application is HTML (HyperText Markup Language). HTML is at the heart of every web page we see on the internet.

It contains the basic building blocks of every page, including headings, forms, images, and many other elements.

The web browser, in turn, interprets these elements and presents them to the end user.

Here’s a very simple example of an HTML page:

<!DOCTYPE html>
<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>
        <h1>A Heading</h1>
        <p>A Paragraph</p>
    </body>
</html>

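In a browser, this renders as a large heading reading "A Heading" with the paragraph "A Paragraph" beneath it, while "Page Title" appears in the browser tab.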

As we can see, HTML elements are presented in a tree-like structure, similar to XML and other languages:

HTML Structure

document
    html
        head
            title
        body
            h1
            p

Each HTML element can contain other HTML elements. The root <html> element contains every other element on the page, and it in turn sits under document, the top-level node that distinguishes an HTML document from documents written in other languages, such as XML.
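To make the nesting concrete, here is a minimal sketch that walks the sample page with Python's standard html.parser module and records the chain of ancestors for every element. The class name PathCollector and the PAGE constant are my own choices for illustration, and the sketch assumes well-formed markup like the example above.

```python
from html.parser import HTMLParser

# The sample page from earlier in the article.
PAGE = """<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<h1>A Heading</h1>
<p>A Paragraph</p>
</body>
</html>"""


class PathCollector(HTMLParser):
    """Records the chain of ancestors for every element it encounters."""

    def __init__(self):
        super().__init__()
        self._open = []   # stack of tags that are currently open
        self.paths = []   # one "a > b > c" string per element

    def handle_starttag(self, tag, attrs):
        self._open.append(tag)
        self.paths.append(" > ".join(self._open))

    def handle_endtag(self, tag):
        # Assumes well-formed input: the closing tag matches the
        # most recently opened tag.
        if self._open and self._open[-1] == tag:
            self._open.pop()


collector = PathCollector()
collector.feed(PAGE)
collector.close()

for path in collector.paths:
    print(path)
# html
# html > head
# html > head > title
# html > body
# html > body > h1
# html > body > p
```

Every path starts with html, which is the point made above: the root element contains everything else on the page.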

The HTML elements above can be visualized as follows:

Each HTML element is opened and closed with a tag that specifies the element's type, for example <p> for paragraphs. The content goes between the opening <p> tag and the closing </p> tag.

Tags can also carry attributes such as an id or a class, for example <p id='para1'> or <p class='red-paragraph'>, which CSS uses to select and format the right elements. The tags and the content together make up the entire element.

URL encoding

An important concept to learn alongside HTML is URL encoding, also called percent-encoding. For the browser to correctly display the page content, it needs to know which character set (charset) is being used.

In URLs, for example, browsers can only use ASCII encoding, which allows only alphanumeric characters and some special characters. Therefore, all other characters outside of the ASCII set must be encoded in the URL.

URL encoding replaces unsafe ASCII characters with a % symbol followed by two hexadecimal digits.

For example, the single quote character ' is encoded as %27, which the browser understands as a single quote. URLs cannot contain spaces, so a space is replaced with either + or %20.
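As a quick way to try this out, here is a small sketch using Python's standard urllib.parse module. The sample string is made up for illustration; quote, quote_plus, unquote, and unquote_plus are the standard library functions.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "it's a test"  # sample input, chosen only for illustration

# quote() with safe="" percent-encodes everything except letters,
# digits, and the characters _ . - ~
percent_form = quote(text, safe="")
print(percent_form)   # it%27s%20a%20test

# quote_plus() uses + for spaces, the form commonly seen in query strings.
plus_form = quote_plus(text)
print(plus_form)      # it%27s+a+test

# Decoding reverses the process. Note that only unquote_plus()
# turns + back into a space.
print(unquote(percent_form))     # it's a test
print(unquote_plus(plus_form))   # it's a test
print(unquote(plus_form))        # it's+a+test
```

Both forms decode to the same text, but the decoder has to match the encoder: a + is only treated as a space when decoding with unquote_plus.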

Some common encoding characters are:
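A short list of such encodings can be generated straight from the rule described above: a % followed by the two hexadecimal digits of the character's byte value. The sketch below is only an illustration; the helper percent_encode and the selection of characters in COMMON are my own, and the result is cross-checked against the standard library's quote function.

```python
from urllib.parse import quote


def percent_encode(ch: str) -> str:
    """Encode one character as % plus two hex digits per UTF-8 byte."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


# A hand-picked selection of characters that often need encoding.
COMMON = [" ", "!", '"', "#", "$", "%", "&", "'", "(", ")"]
table = {ch: percent_encode(ch) for ch in COMMON}

for ch, encoded in table.items():
    # Cross-check the hand-rolled rule against the standard library.
    assert encoded == quote(ch, safe="")
    print(f"{ch!r:<4} {encoded}")
# ' '  %20
# '!'  %21
# '"'  %22
# '#'  %23
# '$'  %24
# '%'  %25
# '&'  %26
# "'"  %27
# '('  %28
# ')'  %29
```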

The complete table can be seen here.

Various online tools can be used to encode and decode the URL. Additionally, the Burp Suite proxy has an encoder and decoder that can be used for conversion between various encoding types.

Try encoding and decoding some characters and strings using this online tool.

Usage

The <head> element usually contains elements that are not directly printed on the page, such as the page title, while all the main elements of the page are located in the <body>.

Other important elements include <style>, which stores the CSS code of the page, and <script>, which loads the JS code, as we will see in the next section.

Together, these elements form the DOM (Document Object Model), in which each element is a node. The W3C defines the DOM as:

A platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of the document.

The DOM standard is divided into three parts:

→ Core DOM: the standard model for all document types

→ XML DOM: the standard for all XML documents

→ HTML DOM: the standard model for all HTML documents

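For example, in the tree structure shown earlier, we can refer to DOM nodes with expressions such as document.head or document.body. Elements that have no shortcut of their own, such as the h1, are reached by looking them up inside the document.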
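Because the DOM is language-neutral, the same idea can be tried outside the browser. The sketch below uses Python's standard xml.dom.minidom module, which implements the core DOM interface. It assumes the page is well-formed XML (the sample page is, once the DOCTYPE line is left out), which real-world HTML often is not.

```python
from xml.dom.minidom import parseString

# The sample page without its DOCTYPE line. minidom is an XML parser,
# so this only works because the markup happens to be well-formed XML.
PAGE = (
    "<html><head><title>Page Title</title></head>"
    "<body><h1>A Heading</h1><p>A Paragraph</p></body></html>"
)

document = parseString(PAGE)
root = document.documentElement               # the <html> element
head = root.getElementsByTagName("head")[0]
h1 = root.getElementsByTagName("h1")[0]

print(root.tagName)              # html
print(head.parentNode.tagName)   # html
print(h1.parentNode.tagName)     # body
print(h1.firstChild.data)        # A Heading
```

Each node knows its parent and its children, which is what lets a script walk from any element to any other.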

Understanding the HTML DOM structure can help us understand where each element we see on the page is located, allowing us to view the source code of a specific element on the page and look for potential issues.

We can locate HTML elements by their id, tag, or class.
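Here is a rough sketch of those three lookups, again using Python's xml.dom.minidom on a small, well-formed snippet. The helper names by_tag, by_id, and by_class and the sample markup are hypothetical, written only for this example. In browser JavaScript the corresponding built-ins are document.getElementsByTagName, document.getElementById, and document.getElementsByClassName.

```python
from xml.dom.minidom import parseString

# Made-up markup with a few ids and classes to search for.
PAGE = (
    "<html><body>"
    '<h1 id="main-title">A Heading</h1>'
    '<p id="para1" class="intro red-paragraph">First</p>'
    '<p class="red-paragraph">Second</p>'
    "</body></html>"
)

document = parseString(PAGE)
all_elements = document.getElementsByTagName("*")  # every element node


def by_tag(tag):
    """Return all elements with the given tag name."""
    return document.getElementsByTagName(tag)


def by_id(element_id):
    """Return the first element whose id attribute matches, or None."""
    for element in all_elements:
        if element.getAttribute("id") == element_id:
            return element
    return None


def by_class(class_name):
    """Return all elements whose class attribute lists class_name."""
    return [
        element
        for element in all_elements
        if class_name in element.getAttribute("class").split()
    ]


print(len(by_tag("p")))                                   # 2
print(by_id("para1").firstChild.data)                     # First
print([el.tagName for el in by_class("red-paragraph")])   # ['p', 'p']
```

An id is expected to be unique on a page, so by_id returns a single element, while a tag or class can match many elements and the other two helpers return collections.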

This is also useful when we want to exploit front-end vulnerabilities such as XSS to manipulate existing elements or create new ones that serve our needs.

Thank you for reading my content.

Check out my other works: bento.me
