An Introduction to Web Accessibility Part 2: Semantic HTML

In the second part of this series, I’ll introduce the concept of semantic HTML and how using HTML correctly builds the foundation for an accessible website.

Dickson Tan
Government Digital Services, Singapore
8 min read · Dec 22, 2019

The Medium platform does not fully support headings, a hyperlinked table of contents or clickable footnotes. For a better reading experience, read the full article series on one page here.

The techniques discussed here predate the rise of React and other frontend libraries, and are relevant regardless of what technologies are used.

Semantic HTML

Semantic HTML is the use of HTML elements according to what they are, instead of how they render in browsers by default. This enriches web content with meaning.

For instance, the heading tags <h1> through <h6> indicate that the enclosed text is a heading, and the heading level communicates the hierarchical structure of the page's contents. Headings are semantic because they tell software what the text is, and presentational because browsers render them in a way readers recognise as headings. On the other hand, the <b> tag is only presentational: it defines how text should look, but does not convey additional meaning.

Semantic HTML is the foundation of an accessible website. For example, screen readers provide alternative ways of navigating the page, so users can jump between different types of content like links, forms, headings, lists, and paragraphs. This wouldn’t work if divs and spans were used instead, as they are explicitly designed to not convey any meaning.

Using semantic HTML is also a software engineering best practice. It makes development easier by resulting in more readable source, leveraging built-in functionality browsers provide for many elements and simplifying selectors.

Expertise in the correct use of HTML is often not valued or appreciated. Even though HTML is the building block of the web, many introductory online materials and university courses don’t spend enough time teaching it in detail. Instead, they promote the misuse of divs and spans, attaching classes like button for styling and adding JavaScript to mimic the functionality of built-in elements. This is the root cause of many inaccessible websites.

Use Native HTML Whenever Possible

Use built-in HTML elements such as <a>, <button> and <select> for interactivity. This increases accessibility and leverages the built-in behaviour provided by browsers. Try navigating with a keyboard on this example webpage with some buttons, a select and some form fields.

  • A <button> is keyboard accessible by default. It can be reached by using the tab and shift+tab keys to move about the page and has an outline to indicate when focus lands on it. Its onClick handler is invoked by using the enter or space keys¹. <button> is also correctly read as a button by screen readers.
  • A <select> is also keyboard accessible by default. When focused, the up / down arrow keys can be used to change the selected item.
  • The <datalist> provides autocomplete functionality when used with the <input> element. Autocomplete suggestions are presented while typing. The arrow and enter keys can be used to select the desired option. No JavaScript is needed for this behaviour.
  • Using the correct type of the <input> element allows mobile browsers to display specialized keyboards to make data entry easier. For example, typing into an <input type="email"> displays a keyboard optimized for email entry by showing an @ key.
  • <details> and <summary> are native elements for implementing disclosures, which are useful for showing additional information on request. They are accessible out of the box, and don’t require any JavaScript.
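The elements above can be combined as in the following minimal sketch. The ids, names, option values and summary text are made up for illustration. No JavaScript is involved.

```html
<!-- Hypothetical form: every id, name and option below is a placeholder. -->
<form>
  <!-- type="email" lets mobile browsers show a keyboard with an @ key. -->
  <label for="email">Email</label>
  <input type="email" id="email" name="email">

  <!-- The list attribute ties the input to the datalist's suggestions. -->
  <label for="fruit">Favourite fruit</label>
  <input id="fruit" name="fruit" list="fruit-options">
  <datalist id="fruit-options">
    <option value="Apple"></option>
    <option value="Banana"></option>
    <option value="Cherry"></option>
  </datalist>

  <!-- Arrow keys change the selected item once the select has focus. -->
  <label for="size">Size</label>
  <select id="size" name="size">
    <option>Small</option>
    <option>Medium</option>
    <option>Large</option>
  </select>

  <!-- Reachable with Tab, activated with Enter or Space. -->
  <button type="submit">Submit</button>
</form>

<!-- A native disclosure: the summary toggles the hidden content. -->
<details>
  <summary>Why do we ask for your email?</summary>
  <p>Placeholder explanation shown only when the disclosure is opened.</p>
</details>
```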

Many sites use scripted divs and spans to replicate the functionality of native HTML elements. Unfortunately, they typically only account for use with a mouse and such controls are almost always inaccessible. While it is possible to do so correctly, it is easier to use normal HTML. See this example on replicating a button with a div for why this should be avoided.
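To give a sense of the extra work involved, here is a rough sketch comparing a native button with a scripted div. The save and activateOnKey functions are hypothetical, and the div version only approximates native behaviour (for instance, a real button activates on Space when the key is released).

```html
<!-- Native: focusable, announced as a button, activated by Enter and Space. -->
<button type="button" onclick="save()">Save</button>

<!-- Scripted div: each behaviour has to be added back by hand. -->
<div role="button" tabindex="0" onclick="save()" onkeydown="activateOnKey(event)">
  Save
</div>

<script>
  // Hypothetical action shared by both controls.
  function save() {
    console.log('Saved');
  }

  // Re-creates keyboard activation that <button> provides for free.
  function activateOnKey(event) {
    if (event.key === 'Enter' || event.key === ' ') {
      event.preventDefault(); // stop Space from scrolling the page
      save();
    }
  }
</script>
```

Even with all of this in place, the div still lacks a default focus outline style that matches other buttons and does not submit forms, which is why the native element is the easier choice.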

Note that using scripted divs is not always bad — doing so is necessary when implementing widgets not built into HTML.

Use Headings to Create Logical Structure

Headings should be used to create a logical outline of the page. Good heading structure makes it easier for search engines and screen reader users to understand a web page.

The content under an <h1> should represent the main content of the page, with subsections marked with <h2>, sub-subsections with <h3> and so on. A heading's level should never increase by more than 1. For example, avoid following an <h2> directly with an <h4>.
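As an illustration, a page outline following these rules might look like the sketch below. The heading text is invented, and the indentation is only there to make the hierarchy easier to see.

```html
<!-- A logical outline: levels only ever increase by one. -->
<h1>An Introduction to Web Accessibility</h1>
  <h2>Semantic HTML</h2>
    <h3>Use native HTML whenever possible</h3>
    <h3>Use headings to create logical structure</h3>
  <h2>Testing for accessibility</h2>

<!-- Avoid: the level jumps from 2 to 4, skipping level 3. -->
<h2>Semantic HTML</h2>
<h4>Use native HTML whenever possible</h4>
```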

You can use the a11y-outline extension to visualize the heading structure of a web page.

Watch out for these antipatterns:

  • Use of heading tags for formatting. It is very common for websites to use e.g. an <h4> just for how it is styled in browsers even though the enclosed text is not logically a level 4 heading.
  • Pseudo-headings that can only be perceived visually. For example, use of a <b> or a <div> with CSS for section headings.
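A short sketch of both antipatterns and one possible fix follows. The class names and text are hypothetical. The idea is to pick the element for its meaning and then adjust its appearance with CSS.

```html
<!-- Avoid: these look like headings but are invisible to heading navigation. -->
<div class="section-title">Opening hours</div>
<b>Opening hours</b>

<!-- Avoid: a heading tag used only because its default size looks right. -->
<h4>Prices include tax.</h4>

<!-- Prefer: the correct element, restyled with CSS where needed. -->
<style>
  h2.compact { font-size: 1rem; }
  .fine-print { font-size: 0.875rem; font-weight: bold; }
</style>
<h2 class="compact">Opening hours</h2>
<p class="fine-print">Prices include tax.</p>
```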

Avoid Using Links Without a Valid Href

An <a> element without a valid href is not a valid hyperlink. Browsers therefore exclude it from the tab order, so it cannot be reached or activated with the keyboard. The default blue, underlined link styling is also not applied to it.

Code like <a onclick="...">my link</a>, where a script run from the onClick handler performs the navigation, is very common, especially in Single Page Applications (SPAs).

This is usually caused by the following:

  • It can be difficult to decide when to use links or buttons. Use a <button> if an action is performed on click, but does not navigate you to a different page. Use an <a> to navigate to a section within the current page, or to a new page. Also avoid styling links to look like buttons and vice versa, as this causes confusion.
  • In SPAs, it is common for links to be rendered without a valid href, with an onClick handler performing the actual navigation. If you see this in your own code, lean on routing in your application to use a proper href, and let browsers do what they are designed for - following links.
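The two points above could look something like this in practice. This is only a sketch: the URLs and ids are placeholders, and navigateTo is a hypothetical stand-in for whatever router your application uses.

```html
<!-- Navigation to another page or to a section: use a link with a real href. -->
<a href="/pricing">Pricing</a>
<a href="#contact">Jump to contact details</a>

<!-- An action that does not navigate: use a button. -->
<button type="button" id="toggle-menu">Toggle menu</button>

<!-- SPA link: keeps a real href, with client-side routing layered on top. -->
<a href="/settings" id="settings-link">Settings</a>

<script>
  // Hypothetical router function; a real application would render the new view.
  function navigateTo(path) {
    history.pushState({}, '', path);
  }

  document.getElementById('settings-link').addEventListener('click', function (event) {
    // Leave modified clicks (e.g. open in a new tab) to the browser.
    if (event.metaKey || event.ctrlKey || event.shiftKey || event.altKey || event.button !== 0) {
      return;
    }
    event.preventDefault();
    navigateTo(event.currentTarget.getAttribute('href'));
  });
</script>
```

Because the href is real, the link stays in the tab order, can be activated with the Enter key and still works with features such as opening in a new tab.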

Invalid links without hrefs can still be made accessible by replicating the lost keyboard functionality that browsers provide, but this should only be a last resort:

  • Add tabIndex="0" so that the invalid link gets included in tab order.
  • Attach a keyboard handler so that the action performed on click is also triggered when pressing the enter key.
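A minimal sketch of this last-resort approach is shown below. The id, URL and openReport function are hypothetical. The role="link" attribute is not one of the two steps above; it is added here on the assumption that you also want screen readers to announce the element as a link.

```html
<!-- Last resort only: prefer <a href="/report"> wherever possible. -->
<a id="report-link" tabindex="0" role="link">Open report</a>

<script>
  var link = document.getElementById('report-link');

  // Hypothetical action performed on click.
  function openReport() {
    window.location.assign('/report');
  }

  link.addEventListener('click', openReport);

  // Restore keyboard activation, which a real link gets for free.
  link.addEventListener('keydown', function (event) {
    if (event.key === 'Enter') {
      openReport();
    }
  });
</script>
```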

For more on using links effectively, see WebAIM’s page on Links.

Programmatically Label Form Elements

Associate labels with form elements like text fields, radio buttons, check boxes and buttons. This is usually done with the <label> element and its for attribute, whose value matches the id of the form field.
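A minimal sketch of this association, using made-up ids and names:

```html
<!-- The for attribute must match the id of the field being labelled. -->
<label for="first-name">First name</label>
<input type="text" id="first-name" name="first-name">

<!-- The same technique works for check boxes and radio buttons. -->
<input type="checkbox" id="newsletter" name="newsletter">
<label for="newsletter">Subscribe to the newsletter</label>
```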

When a form field’s purpose can be identified by visual cues so a visible label would be redundant, the aria-label attribute or a visually hidden <label> should be provided for screen reader and voice input users.
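Both options are sketched below for a hypothetical search field. The visually-hidden class name and its CSS are assumptions based on a commonly used pattern, not something built into browsers.

```html
<!-- Option 1: aria-label supplies a name without any visible text. -->
<input type="search" name="q" aria-label="Search this site">
<button type="submit">Search</button>

<!-- Option 2: a real label that is hidden visually but still read aloud. -->
<style>
  .visually-hidden {
    position: absolute;
    width: 1px;
    height: 1px;
    margin: -1px;
    padding: 0;
    border: 0;
    overflow: hidden;
    clip: rect(0 0 0 0);
    white-space: nowrap;
  }
</style>
<label for="site-search" class="visually-hidden">Search this site</label>
<input type="search" id="site-search" name="q">
```

Avoid hiding the label with display: none or the hidden attribute, as that can also hide it from assistive technology.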

The <label> element can also be used as a container that wraps its form field, which labels the field without needing a for attribute.
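For example, with a made-up field name:

```html
<!-- The label wraps the input, so no for/id pairing is needed. -->
<label>
  First name
  <input type="text" name="first-name">
</label>
```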

Browsers allow clicking the label to select its corresponding field, which is especially useful on mobile devices with smaller screens. Correctly labelling form elements also gives voice input users a better experience, so that commands like “click first name” work correctly. Labels make it much easier for password managers to autofill forms accurately, since they don’t have to resort to heuristics to work out which field asks for which piece of information. Similarly, when fields are not properly labelled, screen reader users have to guess which label text belongs to which control.

Use the HTML5 Sectioning Elements

We use layout, spacing and colour to visually separate web pages into distinct sections such as a navigation bar, the main contents, supplementary asides and a footer with copyright information.

Replacing the use of the <div> container with the semantic sectioning elements introduced in HTML5 makes these relationships visible when not rendered visually. Screen readers present this information so their users can immediately comprehend the overall layout of the website, and move quickly to a particular section. Many other types of software rely on semantic sectioning elements to find and interpret the main contents of a page. Examples include search engines, read-it-later services like Pocket, watchOS 5's display of web pages and the reader mode in browsers.

You can also use the a11y-outline extension to visualize the structure created by sectioning elements on a web page.

The most common sectioning elements include:

  • <header> typically for a section containing the website's main logo, title and navigation.
  • <nav> for sections of site-oriented navigation links. This may be nested in a <header>.
  • <main> for the section containing the main contents of the page. Avoid having more than one <main> element on a page.
  • <article> for self-contained content that would make sense outside the context of its surroundings.
  • <footer> usually for a page footer containing copyright and license information.

Here is how these elements would be used on a typical site.
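A minimal sketch of such a layout, with placeholder links and text:

```html
<body>
  <header>
    <a href="/">Example Site</a>
    <!-- Site-oriented navigation, nested inside the header. -->
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/articles">Articles</a></li>
        <li><a href="/about">About</a></li>
      </ul>
    </nav>
  </header>

  <!-- Only one main element per page. -->
  <main>
    <!-- Self-contained content that makes sense on its own. -->
    <article>
      <h1>Article title</h1>
      <p>Article text goes here.</p>
    </article>
  </main>

  <footer>
    <p>Copyright and licence information goes here.</p>
  </footer>
</body>
```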

Use <table> For Tabular Content

Use the <table> element for tabular content. This makes it easier for software to interpret the data.

Screen readers rely heavily on well-structured tables to provide a good reading experience. They provide specialized hotkeys for users to navigate by cell and to move to the next or previous row or column, which mimics visually scanning horizontally or vertically. Using the <th> element for column headers causes the name of the column being moved to to be announced, which is equivalent to visually scanning upwards to check its title.
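Here is a small sketch of a table marked up this way. The data is invented, and the caption and scope attributes are optional extras that I am assuming you would want in order to make the header relationships explicit.

```html
<table>
  <caption>Office opening hours</caption>
  <thead>
    <tr>
      <!-- Column headers: announced when moving into a column. -->
      <th scope="col">Day</th>
      <th scope="col">Opens</th>
      <th scope="col">Closes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <!-- Row header: announced when moving into a row. -->
      <th scope="row">Monday</th>
      <td>09:00</td>
      <td>17:00</td>
    </tr>
    <tr>
      <th scope="row">Tuesday</th>
      <td>09:00</td>
      <td>17:00</td>
    </tr>
  </tbody>
</table>
```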

Avoid nesting tables within tables, as it is more difficult for software to parse and for users to understand.

Do not misuse <table> for visual layout. Use CSS instead because this is what CSS is designed for. Unfortunately, tables are often misused in emails for layout. This results in poor HTML that doesn't render well in some email clients. It also causes screen readers to attempt to interpret the entire email as a data table, making it challenging for their users to read such emails. Listen to the audio embeds on Litmus's blog post to experience how difficult it is to read emails with bad markup compared to one with clean HTML.

Besides tabular data, information that is spread across 2 axes should use the <table> element, with styling to remove unwanted cell borders if needed. For example, the display of threads in a forum typically has the thread's title, its author, the number of posted replies and when the last reply was posted. Using divs for each cell prevents efficient navigation for screen reader users, such as scanning downwards to rapidly see the titles of new threads or moving to the next column of information. To hear this with a screen reader, compare reading the statuses of pipelines being run on GitLab with reading the examples in the documentation for <table>².
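The forum example could be sketched roughly as follows. The class name, URLs and thread data are placeholders, and the CSS is just one possible way of removing cell borders while keeping the table semantics.

```html
<style>
  /* Strip the grid look without giving up the table structure. */
  .threads { border-collapse: collapse; }
  .threads th,
  .threads td { border: 0; padding: 0.5rem 1rem; text-align: left; }
</style>

<table class="threads">
  <thead>
    <tr>
      <th scope="col">Thread</th>
      <th scope="col">Author</th>
      <th scope="col">Replies</th>
      <th scope="col">Last reply</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><a href="/threads/1">Placeholder thread title</a></td>
      <td>alice</td>
      <td>12</td>
      <td>2 hours ago</td>
    </tr>
    <tr>
      <td><a href="/threads/2">Another placeholder thread</a></td>
      <td>bob</td>
      <td>3</td>
      <td>Yesterday</td>
    </tr>
  </tbody>
</table>
```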

Using semantic HTML is both an accessibility and software engineering best practice. It enriches web content with meaning, enabling the presentation of information in alternate formats to suit users, automated search engine crawlers and other programs. Using semantic HTML also reduces development time by not reimplementing built-in elements.

The next part of this article will cover practical strategies you can use to test the accessibility of your websites during development.

  1. Native HTML elements are keyboard compatible by default. Despite its name, the onClick handler is also invoked when such elements are keyboard activated.
  2. Reading the list of pipelines being run on GitLab with a screen reader is extremely inefficient due to the misuse of divs there. It is impossible to scan downwards to read the list of merge requests triggering pipelines or rapidly find pipelines that have failed. You might also notice many other issues on that page, but that isn’t the main focus here.
