What is three.js?

Dusan Bosnjak
12 min read · Aug 5, 2018

I’d like to share some thoughts on how three.js, as a framework, fits the “big picture”. Three.js does a lot of things, and it can be somewhat confusing how it relates to other 3d fields. Its scope is also ever evolving, so it’s not straightforward to sum up, and these observations are subjective.

JavaScript 3D library

The aim of the project is to create an easy to use, lightweight, 3D library. The library provides <canvas>, <svg>, CSS3D and WebGL renderers.

This is the official description from the github repo. It actually sums it up pretty well, but every subject in this sentence is a broad topic on its own, and this is not all that three.js does.

Let’s dissect this description:

JavaScript 3D library

The library itself is written in javascript and is intended to be used in a javascript environment. For the most part this means that it will run client side, in a web browser on some device. But with node.js and headless browsers it could also be used server side. The first thought that comes to mind is rendering: maybe some preview screenshots generated on the server. But it could also be pure 3d computation, since three.js has a rich math library.

JavaScript 3D library

This is a super broad term. 3D can mean a lot of things. For the most part we think of “graphics”.

Most three.js projects we see involve realtime 3d graphics, where the user’s interactions result in immediate visual feedback. The other type of 3d graphics covers effects and artificial characters in movies, or the various “renderings” you might see in print or in a web catalog (for example, IKEA’s website is full of 3d graphics, as all of their product shots were computer generated). I’ve seen this referred to as “offline (3d) rendering”.

A subset of all of this is 3d math. 3d graphics cannot be done without math, and computer languages don’t understand 3d concepts by default. This is where a library comes in: it abstracts those mathematical operations, perhaps optimizes them, and exposes a high-level interface such as Matrix4 or .dot().

Three.js comes with its own math library with specific classes for 3d math. There are standalone libraries that deal with this math alone, but with three, it’s just a subset of a much bigger system.
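
For example, a minimal sketch of using three’s math classes on their own, with no rendering involved:

const forward = new THREE.Vector3(0, 0, -1);
const up = new THREE.Vector3(0, 1, 0);
forward.dot(up); // 0: the vectors are perpendicular
const spin = new THREE.Quaternion().setFromAxisAngle(up, Math.PI / 2);
forward.applyQuaternion(spin); // forward now points down the negative X axis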

The library provides <canvas>, <svg>, CSS3D and WebGL renderers.

Rendering is the other big responsibility of the library, but this is also where things get a bit tricky. WebGL is pretty special and stands out from this group.

With canvas, svg, and css, three’s responsibility is purely 3d rendering. There are many other libraries that use these APIs to draw non-3d stuff, or the APIs do so by default (css draws 2d rectangles, canvas various 2d shapes), but they need a touch of magic and the 3d math to actually do 3d rendering.

The touch of magic mostly comes in the form of the interface abstraction. For example, it’s pretty tricky to manage the 3d state of a <div> element that is turned into 3d via CSS. It takes a lot of logic to make the canvas API draw something that looks 3d. WebGL is orders of magnitude more involved.

Three abstracts all these APIs into something as simple as render() but in order to do so it needs a generic representation of what a “3d world” is.
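
For the simplest case, that could look something like this (a sketch, assuming a scene and camera are already set up, as the next section describes):

const renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);
renderer.render(scene, camera); // one call, whatever the backend does underneath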

Scene graph

It’s possible to distinguish an area of three.js that serves as this generic “3d world” abstraction. A scene graph is a data structure that is used to describe how objects in some 3d scene (world) relate to each other. It doesn’t actually have to be 3d, as this is a suitable way to describe any vector graphic hierarchy. It’s specifically a “tree” made out of “nodes”, with a “root node” that branches out. In three.js the base class for this data structure is Object3D.

This is almost exactly the same as the DOM tree. THREE.Scene would be analogous to <body> and everything else is a branch. In the DOM we can position things, but we are fairly limited: rotation usually happens around one axis, and we move things left/right or up/down. In a 3d scene graph we have more degrees of freedom.

Three’s scene is more like a virtual DOM. We do our operations and set state on that tree, and when we desire a visual snapshot of that state (say in a continuous loop, or on some user interaction/state change), we call render(scene). With the DOM you don’t want to update the entire tree when something changes, while with the <canvas> element we have to clear the entire view and redraw everything even if only a single element changed position.

A <div> within a <div> would be analogous to the parent-child relationship of THREE.Mesh('sun')->THREE.Mesh('earth'). A CSS rule could be analogous to a THREE.Material, where a description such as color:'red' causes magic to happen and something to be painted red. Finally, calling threeRenderer.render(scene) could be analogous to the browser loading some html page with some CSS rules.

Mesh, Scene, Camera, and Light are all subclasses of this one generic class. This is what allows you to add() a “box” to a “scene”, or have a “light” follow a “camera”.
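
For instance (a sketch, assuming scene, camera, and light variables already exist):

scene.add(new THREE.Mesh(new THREE.BoxGeometry(1, 1, 1), new THREE.MeshBasicMaterial()));
camera.add(light); // the light is now a child of the camera and moves with it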

A simple structure can be very flat. The root node can be seen as the “world” and it could have “ground”, “house”, “sun”, “camera” as its children.

THREE.Scene('world')
|-THREE.Mesh('ground')
|-THREE.Mesh('house')
|-THREE.Light('sun')
|-THREE.Camera('main')

This is enough information to give to a renderer in order to obtain a visual result. Relative to some scene, there are two meshes representing different things, there’s terrain, and a house on a hill. One light defining how they’re lit (morning vs noon vs flash light) and one object (camera) that defines our vantage point, a view into this world.
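
Building that tree in code is just a handful of add() calls. A sketch, with the geometries and materials left as hypothetical variables:

const world = new THREE.Scene();
const ground = new THREE.Mesh(groundGeometry, groundMaterial); // hypothetical geometry/material
const house = new THREE.Mesh(houseGeometry, houseMaterial); // hypothetical geometry/material
const sun = new THREE.DirectionalLight(0xffffff);
const camera = new THREE.PerspectiveCamera(50, 16 / 9, 0.1, 1000);
world.add(ground, house, sun, camera);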

The results may vary, CSS for example being limited to a very stylized rendering, while WebGL could render shadows and overall high fidelity.

Through this structure, the 3D world is managed. If we wanted to simulate how daylight affects the house during different times of the year, we would programmatically change the light’s position and orientation in the world. The job of the scene graph is to expose this hook, i.e. “position”, but in order to actually animate it you have to implement your own logic. A simple way of animating a 3d three.js scene is with a “tweening” library.
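
A minimal sketch of that daylight idea, reusing the world, sun, and camera variables from the tree above (and the renderer from the earlier sketch), driving the position by hand in a render loop:

let time = 0;
function animate() {
  requestAnimationFrame(animate);
  time += 0.001;
  sun.position.set(Math.cos(time) * 50, Math.sin(time) * 50, 0); // sweep across the sky
  renderer.render(world, camera);
}
animate();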

All of this is probably only true in theory and you may not be able to just switch out renderers on a scene as you please. But this is mostly due to the overlap of “materials” with the scene graph and their coupling with the renderers. For example, there is no way to have a <div> cast a shadow or appear as metal, which is what a WebGL material could describe, but it is possible to make it “red” which is what all materials can do.

Underneath it all Object3D is still generic, and the spatial and hierarchical relation of the nodes to each other is described through the “scene graph”.

In simple English, it’s the scene variable you end up with after you call scene.add(my_foo) a bunch of times.

WebGL

WebGL is super duper special, and is probably used in something like 99% of the three.js apps out there. It’s a big topic, so it might be worth doing an overview of the alternatives first.

canvas, css, svg

These are all APIs: interfaces that you, as a programmer, can use to tell the browser to draw certain things.

CSS is the most common interface on the web, since without it, everything would just look like plain text. Historically it had nothing to do with 3D.

Canvas actually uses the same element for drawing as WebGL, but a different context. The context is literally called “2d”, but since 3d is fake anyway (we always draw to some kind of 2d screen, be it real or virtual), we can use this context to draw 3d graphics as well.
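
As an illustration, here is the classic trick for faking 3d with the “2d” context: perspective-project a point, then draw it with plain 2d calls (the focal length and screen center are arbitrary made-up values):

const ctx = document.querySelector('canvas').getContext('2d');
function project(x, y, z, focal = 300) {
  const s = focal / (focal + z); // the perspective divide: farther away means smaller
  return { x: 160 + x * s, y: 120 + y * s };
}
const p = project(50, 20, 100);
ctx.beginPath();
ctx.arc(p.x, p.y, 5, 0, Math.PI * 2);
ctx.fill();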

SVG is another non 3d API, commonly used to describe things like logos or icons. However since it can describe primitive things such as lines, these can also be rendered in the context of 3D (like an overlay over a map, or space aware UI or HUD elements).

A common thread here is that none of these were intended to be used for 3D.
Another important trait is that they’re all high level — already intended to do something else. For example, all three know how to draw a “circle”. With canvas this is an explicit shape, with CSS you might have to use border radii and whatnot, but the end result is very direct access to “a circle”.

Three translates this very high-level speak into yet another high-level speak:

Tell me where you want this 3d thing at, and I’ll make sure it renders as a 3d thing right there.

Low level

I’d like to say that WebGL doesn’t know how to draw anything, but that’s not true.

WebGL rasterizes primitives and draws the results into buffers

^ this, on the other hand, may sound pretty daunting.

WebGL is very low level; it does not know much about the concept of 3d graphics. 3D graphics require specific mathematical computations, and A LOT of them. Just think of your high-resolution screen for a second, and how many pixels it has. If you have to run a computation for each pixel to determine how some light affects some surface, and you have to do this 60 times a second, the numbers add up.
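
To put a number on it: a 1920×1080 screen has 2,073,600 pixels, so at 60 frames per second that is roughly 124 million per-pixel computations every second, before counting the lighting math inside each one.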

To mitigate this, there is something called hardware acceleration.

GPU

Most computers such as laptops, desktops, cell phones and even watches have some sort of hardware device that can efficiently compute these 3d operations, and allow for interactive graphics to be rendered. This is called the graphics processing unit.

It’s different from the main processor as it’s made for a different purpose — specific mathematical operations that run in parallel.

The same way we use javascript to program the browser, we use WebGL to program the graphics card.

Well, that’s true in concept, but in practice these are two vastly different beasts. WebGL consists of both javascript code (the commands) and a completely different language that actually does the computation (GLSL). Somewhat of a parallel could be drawn between HTML and JavaScript and how they work together on the same page.

2D and 3D

It’s not just 3d that benefits from this hardware acceleration. Video processing is a good candidate as well: you could program the graphics card to transform colors or distort the image on a live video feed.

Being so low level, WebGL is generic. It doesn’t know about 2d or 3d, but knows about memory, buffers, command queues, shaders, etc.

Parallel programming is also different from how you would program in JavaScript. A common problem is how to have different threads access a common variable.

This different paradigm means that there is a whole other language involved, called GLSL. It’s a shader language, which in some form exists in any low-level graphics API. This is how you write the actual logic for that massive number crunching, and the only help you get is that you don’t have to write machine code.

The other part of the WebGL API is the javascript bindings it exposes, through which you tell the GPU to do stuff. A shader is “do computation A” and the binding is “run A a million times”.

It’s up to the programmer to define what computation A is. It could be something 3D related, or a kernel that blurs a video.

When you start to abstract these computations and these commands you end up with three.js.
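
For a taste of that abstraction, three.js lets you supply “computation A” yourself through ShaderMaterial while it handles all the commands. A minimal sketch (uTime is a made-up uniform name):

const material = new THREE.ShaderMaterial({
  uniforms: { uTime: { value: 0 } }, // set from javascript, read from GLSL
  fragmentShader: `
    uniform float uTime;
    void main() {
      gl_FragColor = vec4(abs(sin(uTime)), 0.0, 0.0, 1.0); // runs once per pixel
    }
  `,
});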

Renderers working together

One use case that makes a lot of sense is to use a combination of renderers, each drawing the things it’s good at, “in 3d”. WebGL can crunch a lot of numbers and produce realistic, high-fidelity visuals, but is poor at handling text, and sometimes even lines. An additional layer rendering text could be controlled via the CSS and canvas renderers, while various paths and lines go through the SVG one.
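
For example, three ships a CSS2DRenderer add-on (it lives under /examples, and the exact import path varies by version) that can overlay crisp DOM text on top of a WebGL scene. A sketch reusing the house mesh from earlier:

const labelRenderer = new CSS2DRenderer();
labelRenderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(labelRenderer.domElement);

const div = document.createElement('div');
div.textContent = 'house';
house.add(new CSS2DObject(div)); // the label now tracks the mesh in 3d

// in the render loop, both renderers draw the same scene:
renderer.render(scene, camera); // WebGL: geometry, lights, shadows
labelRenderer.render(scene, camera); // CSS: text, sharp at any zoom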

THREE.WebGLRenderer

All of this low level stuff is abstracted through one three.js class, WebGLRenderer. This is what translates a cube into a bunch of numbers in GPU memory.

Ironically, it’s the only three.js renderer that doesn’t have to do 3D graphics exclusively, but it is the best suited for it. The others fake 3D using 2D APIs; the WebGL one purposely does 3D by using a generic parallel computation API. But this still doesn’t exclude a scenario where you use it exclusively to process that live video stream. It abstracts enough of WebGL to make it useful for this task, but you’d probably use a third of the library.

You could build a super responsive UI layer with WebGL, or a super mario type platform game where three.js would still be a great tool.

The fact that you would be using only a third of the library though means that there could be a different tool more suitable for that use case, or that you could build only a subset of three.js. Both the super mario and the video processing examples would perhaps only need the PlaneGeometry and maybe one type of Material.
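
A sketch of the platformer case, to show how little of the library it touches (the texture path is hypothetical, and a scene is assumed to exist):

const hero = new THREE.Mesh(
  new THREE.PlaneGeometry(1, 1),
  new THREE.MeshBasicMaterial({
    map: new THREE.TextureLoader().load('hero.png'), // hypothetical sprite image
    transparent: true,
  })
);
scene.add(hero);
hero.position.x += 0.1; // "walking" is just translating the plane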

Let’s try to reiterate:

THREE-Math

JavaScript code that does 3d-specific math operations. JS has Math.pow() by default but not Quaternion.inverse(). With these classes we can write algorithms that don’t have to be rendered — for example, a game server that validates who shot whom would do a lot of raycasting but wouldn’t draw anything.
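
A sketch of that server-side idea, with made-up positions; note that nothing here touches a renderer:

const shot = new THREE.Raycaster(
  new THREE.Vector3(0, 1.8, 0), // the shooter's eye position (made up)
  new THREE.Vector3(0, 0, -1) // the direction they fired in (normalized)
);
const hits = shot.intersectObject(enemyMesh); // enemyMesh: any THREE.Mesh
if (hits.length) console.log('hit at distance', hits[0].distance);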

THREE-Scene-Graph

A family of Object3D subclasses that form a tree data structure which describes a “3d world” and the relationships of objects within it. Conceptually it’s abstract, but it can be somewhat coupled with a specific renderer once you dive into the code.

THREE-Renderer

A layer that translates that generic graph into a visual representation on some screen or some buffer (say you generate it server side). Uses different technologies to achieve this.

THREE-WebGLRenderer

A specific renderer that allows for hardware acceleration and that knows many 3D concepts but can also be used for 2D (and even just generic computation).

These are IMO the core building blocks of three.js. I’d be inclined to replace the “3D” with “Graphics” but it would only apply in the case of WebGLRenderer.

Practical examples

Hopefully all this theory made some sense. Here are some practical examples, compiled from various Stack Overflow questions, that illustrate common confusion users face.

Three.js is not a 3D modeling tool

I’ve loaded a human from human.obj, but now I can’t select its arms or feet.

Three.js knows as much about the “human” as you tell it. If you told it to load “human.obj”, try not to think about it as a “human” but more as a “mesh, loaded from a file”.
Imagine that instead of obtaining human.obj you somehow obtained human_arms.obj, human_torso.obj, and human_legs.obj, and loaded multiple meshes — it would give you a lot more to work with. Now the problem is “selecting a mesh” and not “selecting the human’s feet”, which was actually “selecting a sub mesh (without knowing what it is)”.

The proper workflow here would be to open the obj in something like Blender, split it up into several meshes using its tools, and re-export, either as a single file storing the sub meshes or as multiple files.
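
Once the asset is split and the parts are named, they become addressable. A sketch, assuming an OBJLoader instance and names that actually exist in the file:

loader.load('human.obj', (object) => {
  const arm = object.getObjectByName('arm_left'); // the name comes from the file
  if (arm) arm.material = highlightMaterial; // highlightMaterial: any THREE material
});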

Three also has nothing to do with something like Open Cascade. You can build a modeling tool using three.js, but three would be used for rendering, and possibly as the building blocks for the modeling engine (math). It can also just straight up be wired to an existing engine.

Three.js is not a game engine

Not everyone who needs 3D (or graphics) on the web is making a game. Game engines typically do a lot of optimization on top of describing 3d worlds and displaying them. And different games have different needs: the physics and rendering systems for a real-time strategy game and a first-person shooter would probably look very different.

All this stuff would mean more code, and for someone who just wants to spin a 3d model as part of a product catalog, this would not only be unnecessary but undesired.

You can, of course, build a game engine and use three for rendering, and for the building blocks of the engine.

Three.js doesn’t load much

Sure, the core has some loaders for some assets, but loaders for the common formats like glTF or FBX are standalone. Three doesn’t care how you obtain your assets, as long as you parse them correctly and create THREE objects.

As far as three is concerned there is no difference between a mesh from a gltf file and a procedural sphere. Many creative examples use cubes and spheres and don’t load anything other than three.js itself.

The core loaders are very generic, loading images, files, and direct representations of three’s objects like Material or Texture. Loaders for specific formats are composed from these building blocks.
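
For example, the glTF loader is composed exactly like that. It lives under /examples (the import path varies by version), and its output is ordinary three objects; the file name here is hypothetical:

new GLTFLoader().load('model.gltf', (gltf) => {
  scene.add(gltf.scene); // just a regular Object3D tree once parsed
});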

There is no “X format doesn’t work with three.js” problem. Either the file is invalid, or the loader didn’t work.

Three.js examples are not three.js

Even though some three.js examples are taken for granted as part of three.js, they’re not. A common example is the various orbit controls. Out of the box, three.js does not know how to handle mouse input, nor how to apply orbiting logic to a camera.

Long story short: if you go to the repo, anything that’s in the folder /examples is considered an add-on. Some are more important than others, since they represent more common use cases. Camera controls are probably more common than an Octree.
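
Using such an add-on means importing it separately from the core. A sketch with OrbitControls (again, the /examples import path varies by version):

const controls = new OrbitControls(camera, renderer.domElement);
controls.update(); // call again whenever the camera is changed manually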

If you’re going to take anything three-related off the shelf and modify it, you’re more likely to end up modifying something in /examples than in /src. I.e., it’s more likely that three can solve the problem without modification, but the example doesn’t fit your use case.

Thanks for reading and please leave a comment :)
