Demystifying Augmented Reality: A Beginner’s Guide to Creating AR Experiences with A-Frame

6 min read · Feb 23, 2024

Wouldn’t it be cool if our current lives looked like this? The technology used to make this possible is called augmented reality. But when augmented reality was first invented, it wasn’t nearly as cool as this.

In 1968, Ivan Sutherland developed the first head-mounted display system, called the Sword of Damocles, which could overlay simple wireframe graphics onto the user’s physical world. In 1990, a Boeing researcher named Tom Caudell coined the term “augmented reality”.

After this, from the late 1990s to the 2000s, organisations like NASA and Boeing started experimenting with augmented reality, and AR-based entertainment experiences like “The Invisible Train” and “ARQuake” gained a lot of popularity.

Pokémon Go, released in 2016, became a global phenomenon and showcased the potential of AR for location-based gaming. The game uses a smartphone’s GPS and camera to overlay virtual Pokémon onto the real world, which led to widespread adoption and cultural impact.

So how does augmented reality even work? What does it consist of?

Augmented Reality combines the real and virtual worlds. It overlays digital content — like images, videos, 3D models, or even text — onto the user’s view of the physical world. This digital information is seamlessly integrated with the real-world environment, appearing as if it exists alongside physical objects, and sometimes, the user can also interact with it.

This digital information is displayed using four main types of AR display: head-worn (headsets), spatial AR (a projector and camera setup, so users do not have to wear any equipment), handheld (phones), and monitor-based.

There are a lot of ways to display AR content, and there are just as many ways to create this content.

Marker-based AR uses a marker as the anchor on which virtual objects are placed. This marker could be a barcode, a Hiro marker, or even a colourful image.

Markerless AR can work through superimposition, by anchoring objects to a specific latitude and longitude, or by spatially augmenting information.

Okay. So now you know what AR is, and how it can be displayed. But as developers, what we want to know is how to create augmented reality experiences. There are tonnes of options available online.

If you prefer web-based applications, there are WebXR, AR.js, and A-Frame. Then there are codeless platforms like Spark AR and WorldCAST. You also have traditional platforms and libraries like Unity, Unreal, Vuforia, ARCore, ARKit, and AR Foundation. While these traditional platforms produce robust applications, they take a lot of time to install and set up.

Today, we are going to use an online coding platform called Glitch together with A-Frame, a web framework that lets you build AR experiences using HTML-like tags.

A-Frame is built on top of HTML and WebGL, which means that not only is it simple to write, but you can also deploy your AR projects in seconds.

Introduction to HTML

HTML is the standard markup language for documents designed to be displayed in a web browser. Most websites contain a file called index.html, which web servers serve by default as the starting point of the site. So this is the file that the browser opens first.

HTML code is written in tags: an opening tag such as <html> and a closing tag such as </html>. A pair of tags acts like a container for content or other tags. For example, <html> </html> is used to define the document as an HTML document.

Some tags have attributes, which are used to add more information to that tag. For example, <html lang="en"> </html> gives the additional information that the document is in the English language.

Every HTML document starts with a <!DOCTYPE html> declaration followed by the <html> tag, which signifies the beginning of the HTML document.

The document is divided into two main parts: the <head> and the <body>. The <head> section contains metadata and resources necessary for the document but not directly visible to the user. This includes elements like the document title, links to external stylesheets or scripts, meta tags, and more.

The <body> section contains the content that is displayed to the user in the web browser. This includes text, images, links, videos, and other visible elements of the webpage.

There are subsections to this, but we won’t be going into them today.
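Putting these pieces together, a minimal HTML document might look like the sketch below. The title and paragraph text are placeholders chosen for illustration, not part of any project in this guide.

```html
<!DOCTYPE html>
<!-- The lang attribute tells the browser the page is in English -->
<html lang="en">
  <head>
    <!-- Metadata: needed by the page but not shown in it -->
    <meta charset="utf-8">
    <title>My first page</title>
  </head>
  <body>
    <!-- Visible content goes here -->
    <p>Hello, world!</p>
  </body>
</html>
```

Saving this as index.html and opening it in a browser should show the paragraph text, with the title appearing on the browser tab.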

Introduction to A-Frame

An A-Frame page has exactly the same structure as the HTML document we just saw, with some additions.

In the head, we import two libraries that are necessary to develop AR projects: the first one is A-Frame itself, and the second one is AR.js.

Then, in the body, we have an element called a-scene, which marks the beginning and end of your AR experience. This is where you define what your virtual world looks like: the background colour, the lighting, and how objects move or interact.
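As a rough sketch, the skeleton of such a page could look like this. The script URLs follow the pattern used in the AR.js examples; the A-Frame version (1.3.0) and the page title are assumptions, so swap in whichever release you are working with.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>AR skeleton</title>
    <!-- Library 1: A-Frame itself (version 1.3.0 is an assumption) -->
    <script src="https://aframe.io/releases/1.3.0/aframe.min.js"></script>
    <!-- Library 2: the AR.js build for A-Frame -->
    <script src="https://raw.githack.com/AR-js-org/AR.js/master/aframe/build/aframe-ar.js"></script>
  </head>
  <body style="margin: 0; overflow: hidden;">
    <!-- Everything between these a-scene tags is part of the AR experience;
         the arjs attribute asks AR.js to use the device camera feed -->
    <a-scene embedded arjs>
      <!-- Virtual objects will go here -->
      <!-- A plain camera entity, as used in the AR.js examples -->
      <a-entity camera></a-entity>
    </a-scene>
  </body>
</html>
```

On its own this page only shows the camera feed; the virtual content is added inside the a-scene element.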

A-Frame provides a handful of elements such as <a-box> or <a-sky> called primitives that wrap the entity-component pattern to make it appealing for beginners.

To know more about A-Frame primitives, see the A-Frame documentation on HTML and primitives.

A-Frame uses an entity-component-system (ECS) architecture:

Entities are container objects into which components can be attached. Entities are the base of all objects in the scene. Without components, entities neither do nor render anything, similar to empty <div>s.
Components are reusable modules or data containers that can be attached to entities to provide appearance, behavior, and/or functionality. Components are like plug-and-play for objects. All logic is implemented through components, and we define different types of objects by mixing, matching, and configuring components. Like alchemy!
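To make the idea concrete, here is a small sketch showing the same kind of red box written twice: once as a primitive and once as a bare entity with components attached. The colour and positions are arbitrary illustration values, and both snippets are meant to sit inside an a-scene.

```html
<!-- Primitive form: a-box is a shorthand that wraps the components below -->
<a-box color="red" position="0 1 -3"></a-box>

<!-- Roughly equivalent entity-component form:
     geometry gives the entity a shape, material gives it an appearance -->
<a-entity
  geometry="primitive: box"
  material="color: red"
  position="2 1 -3">
</a-entity>
```

Removing the geometry and material components from the second element would leave an entity that renders nothing, much like an empty div.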

To know more about the A-Frame ECS, see the Entity-Component-System page of the A-Frame documentation.

Today we are going to use a-box, a-gltf-model, a-camera, and a-entity.

Simply put,

a-box: Creates a cube in our scene. We can customize its size, color, and position to create various objects or elements within our AR environment.
a-gltf-model: With a-gltf-model, we can import 3D models in the glTF format (.gltf or .glb files) and display them in our scene.
a-camera: The a-camera entity is essential for determining the user's perspective within the AR environment. It defines the viewpoint from which the user sees and interacts with the scene, providing a first-person view and enabling immersive navigation.
a-entity: a-entity serves as a versatile container to which we can attach components and behaviors, making it a powerful tool for creating complex and interactive elements within our AR scene.
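A minimal sketch that uses all four elements in one scene might look like this. The file name bird.glb, the asset id, and the colours and positions are hypothetical values for illustration, and the A-Frame script is assumed to be loaded in the head already.

```html
<a-scene>
  <a-assets>
    <!-- bird.glb is a hypothetical model file; point src at your own model -->
    <a-asset-item id="bird" src="bird.glb"></a-asset-item>
  </a-assets>

  <!-- a-box: a simple cube -->
  <a-box color="tomato" position="-1 0.5 -3"></a-box>

  <!-- a-gltf-model: displays the model registered in a-assets above -->
  <a-gltf-model src="#bird" position="1 0.5 -3"></a-gltf-model>

  <!-- a-entity: a blank container given a shape and colour through components -->
  <a-entity
    geometry="primitive: sphere; radius: 0.25"
    material="color: gold"
    position="0 1.5 -3">
  </a-entity>

  <!-- a-camera: the viewpoint the user sees the scene from -->
  <a-camera position="0 1.6 0"></a-camera>
</a-scene>
```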

The properties you will use most to customise your primitives and entities are position, scale and rotation. Each takes three values, corresponding to the x, y, and z axes. By default, position and rotation have a value of 0 0 0, while scale has a value of 1 1 1.

For example, to adjust the scale of a cube, you can use the scale attribute. If you want the cube to have a scale of 1 in all dimensions (maintaining its original size), you would specify scale="1 1 1":

<a-box scale="1 1 1"></a-box>

If you want to increase the length of the cube along the x-axis to 2 while keeping the other dimensions the same, you would adjust the scale accordingly:

<a-box scale="2 1 1"></a-box>
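Position and rotation work the same way as scale, and all three can be combined on one element. The sketch below uses arbitrary illustration values; in A-Frame, position is measured in metres and rotation in degrees.

```html
<!-- A box moved 0.5 up and 3 away from the origin (negative z points
     away from the default camera), turned 45 degrees around the y-axis,
     and stretched to twice its size along the x-axis -->
<a-box
  position="0 0.5 -3"
  rotation="0 45 0"
  scale="2 1 1"
  color="#4CC3D9">
</a-box>
```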

Everything you learned today is all you need to know to start creating your own AR projects, which is what we are going to do now.

We are going to make three small projects: the first using a marker, the second using superimposition, and the third using colourful images. Follow the links below to code along ~

Mini project 1 : Augmenting a box on a Hiro marker using marker-based Augmented Reality in A-Frame

Mini project 2 : Augmenting a 3D model of a bird using superimposed Augmented Reality in A-Frame

Mini project 3 : Augmenting a 3D model on a poster using image tracking in A-Frame

Written by Rujuta J
