Mapping the archipelago of the Web
Traversable 3D representations of the web using generative landscapes
The web is almost too complicated an accumulation to explain in simple terms. Very roughly, it is a series of devices connected by wired and wireless means, exchanging data on many different layers and over a multitude of protocols. Web browsers are only one type of interface we have with the web, arguably the most used and the most useful. Every time the URL (Uniform Resource Locator) in a web browser changes, a whole series of fortunate events has to happen, in layers and in order, for the response from another device, the server, to arrive in the form we all expect.
This is the form of a web page. And it is still a page, because user experience design has been using the metaphor of a desk(top) forever.
What if that form was not a page, though? What else could it be?
In 2007 I saw a video made by cultural anthropologist and digital ethnographer Michael Wesch. In it, Prof. Wesch was making the point that form and content on the web can be separated. Today this video shows its age, with screen grabs of Internet Explorer 6 (?), mentions of Web 2.0 and a definite lack of JSON which, compared to XML, is the data object format of the Gods. And even though its real message was a strong one on a different level, that of digital ethnography, I was transfixed by that one sentence:
Form and content on the web can be separated.
So I packed my metaphorical bags and went to my advisor at the time, Prof. Tomimatsu of Kyushu University, with a grand plan. My thesis would be about a radical new way of looking at the web. The idea was to create something that a user can walk through, explore and traverse. The undertaking was too big, so instead of digging myself a PhD hole with an MDes shovel, I did some Augmented Reality stuff. I think one of the reasons that neither Prof. Tomimatsu nor I saw this through was that VR in 2008 was nowhere near what it is today.
And for a time, it was good.
The idea, of course, stuck in my head. It wasn’t until the new surge of VR technologies that this investigation came back to the foreground, so I had to define it. The hypothesis: there are one or more suitable representations of the web that map from the page domain to the domain of 3D space and back. The constraints: 1) the new metaphor in 3D space should apply to all current web pages, and 2) it should be a 1 to 1 function, so that the mapping also works from 3D space back to web content and can inform the design of web experiences that start out as 3D traversable spaces.
The DOM in space
My first attempts had to be in 2D space. One less dimension in any problem is easier to explore and use as a drafting board. I didn’t even know how a website would look as a 2D graph at that point, let alone anything in 3D. The search yielded two interesting applications. The first was Marcel Salathé’s Java applet that could actually run in a browser, and the second was Oleg Burlaca’s HTML as graphs Perl script. I could already see these being very helpful in my experiments, but the disappearance of the first and the fact that I desperately avoid reading other people’s Perl code (for the second) led me to finally make my own implementation that could run in a web browser. It was a very similar representation of a web page’s DOM (Document Object Model) as a 2D force-directed graph, built with the sigma.js library.
What’s a DOM?
We’ll be using this DOM reference often from now on, so here is some solid quote time.
The Document Object Model (DOM) is a programming API for HTML and XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated. In the DOM specification, the term “document” is used in the broad sense — increasingly […]
With the Document Object Model, programmers can create and build documents (and) navigate their structure […]
One subtle but important difference between graphs and trees is that trees do not contain cycles or loops. The DOM, read as a graph that also follows its links, can contain paths that lead from a node back to itself. These are created by links to URL fragments (e.g. href="#chapter2"), which point at another node of the same document. So conceptually the DOM graph can have cycles and loops, hence it is not strictly a tree. This is difficult to represent in 3D, as the same physical space would have to connect to two places at once. Easily solvable in an experience with a 4th dimension or a teleporter of course, but I was still getting used to 2 of them at that point.
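As a rough illustration of the point above, here is a minimal Python sketch (not the code of the project, which runs in JavaScript). It parses a small page with the standard library, records the parent-child edges of the element tree, adds one extra edge for every fragment link that points at an existing id, and uses union-find to report which edges close a cycle when the graph is treated as undirected. The class and function names, and the sample markup, are hypothetical.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must never stay on the open-element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LinkGraphParser(HTMLParser):
    """Collects tree edges, id attributes and fragment links (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []          # indices of the currently open elements
        self.tags = []           # tag name per node index
        self.tree_edges = []     # (parent index, child index)
        self.ids = {}            # id attribute -> node index
        self.fragment_refs = []  # (anchor node index, target id)

    def handle_starttag(self, tag, attrs):
        index = len(self.tags)
        self.tags.append(tag)
        if self.stack:
            self.tree_edges.append((self.stack[-1], index))
        attributes = dict(attrs)
        if attributes.get("id"):
            self.ids.setdefault(attributes["id"], index)
        href = attributes.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.fragment_refs.append((index, href[1:]))
        if tag not in VOID_TAGS:
            self.stack.append(index)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.tags[self.stack[i]] == tag:
                del self.stack[i:]
                break


def fragment_edges(parser):
    """Edges from an <a href="#x"> element to the element carrying id="x"."""
    return [(source, parser.ids[target])
            for source, target in parser.fragment_refs
            if target in parser.ids]


def cycle_closing_edges(node_count, edges):
    """Return the edges that close a cycle in the undirected graph (union-find)."""
    parent = list(range(node_count))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    closing = []
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            closing.append((a, b))
        else:
            parent[root_a] = root_b
    return closing


SAMPLE = ('<body><nav><a href="#chapter2">Jump</a></nav>'
          '<h2 id="chapter2">Chapter 2</h2></body>')

parser = LinkGraphParser()
parser.feed(SAMPLE)
parser.close()

tree_only = cycle_closing_edges(len(parser.tags), parser.tree_edges)
with_links = cycle_closing_edges(len(parser.tags),
                                 parser.tree_edges + fragment_edges(parser))
```

On the sample, the element tree alone has no cycle-closing edge, while the single fragment link from the anchor to the heading adds one. In a landscape, that extra edge is the place where a teleporter would be needed.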
The simplest possible landmass that is separated from all other landmasses is an island. That was a good enough place to start, but there was more. The most important feature of a webpage, the link, semantically makes everything inside it also a link. If a link includes a paragraph, a title and an image, they all behave as the same link. It is a fortunate coincidence that links are usually on the edge of the graph, and that edge is the visual shore of the island, where the links (boats) to other islands lie.
How to create islands out of HTML
In its current state, the server takes the URL that the user inputs, fetches its DOM and computes a couple of important things: some graph metrics on each node (depth in the DOM, etc.) and the maximum depth of the leaves. The client then has all the data it needs to select a type of visualisation (2D graph, 3D graph, or an island) and start the rendering.
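The server-side step described above can be sketched roughly as follows. This is an illustrative Python stand-in, not the project's own code: it takes the markup as a string instead of fetching a URL, counts element nodes only (text nodes are ignored), and the names DepthParser and dom_metrics are made up for the example.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must never stay on the open-element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DepthParser(HTMLParser):
    """Records tag, depth and child count for every element, in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.nodes = []   # one dict per element

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "depth": len(self.stack), "children": 0}
        if self.stack:
            self.stack[-1]["children"] += 1
        self.nodes.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break


def dom_metrics(html):
    """Return (nodes, max_leaf_depth) for an HTML string."""
    parser = DepthParser()
    parser.feed(html)
    parser.close()
    leaf_depths = [n["depth"] for n in parser.nodes if n["children"] == 0]
    return parser.nodes, max(leaf_depths, default=0)


SAMPLE = """<html><head><title>t</title><meta charset="utf-8"></head>
<body><div><p>Hello <a href="/next">next</a></p></div></body></html>"""

nodes, max_leaf_depth = dom_metrics(SAMPLE)
for node in nodes:
    print("  " * node["depth"] + node["tag"])
```

The per-node depth is the value that later becomes the height of a vertex, and the maximum leaf depth gives the client a scale for the tallest point of the island.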
Currently, in the simplest form, each vertex represents an HTML node, its x and y being computed by a 2D force-directed layout using the ngraph.forcelayout library. Its z coordinate (height) corresponds to how deep it lives in the DOM. Every run of the code should produce the same structure, with only the physics engine providing a small random factor, thus preserving distinctness. A fresh Delaunay triangulation is applied to the 2D graph with each added node, and the whole 3D model is rendered using three.js.
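To make the vertex construction concrete, here is a small Python sketch under stated assumptions. It does not use ngraph.forcelayout or three.js: the layout is a plain Fruchterman-Reingold style loop written for the example, the random generator is seeded so that the same input always yields the same coordinates, and the Delaunay triangulation and rendering steps are left out. The function names, the height_scale parameter and the sample tree are hypothetical.

```python
import math
import random


def force_layout(node_count, edges, iterations=200, seed=0):
    """Tiny force-directed layout: all nodes repel, edges pull their ends together."""
    rng = random.Random(seed)  # seeded, so the layout is reproducible
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(node_count)]
    k = 1.0 / math.sqrt(max(node_count, 1))  # preferred edge length
    for step in range(iterations):
        disp = [[0.0, 0.0] for _ in range(node_count)]
        # Repulsion between every pair of nodes.
        for i in range(node_count):
            for j in range(i + 1, node_count):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[i][0] += dx / dist * force
                disp[i][1] += dy / dist * force
                disp[j][0] -= dx / dist * force
                disp[j][1] -= dy / dist * force
        # Attraction along edges.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node, capped by a temperature that cools down to zero.
        temperature = 0.1 * (1 - step / iterations)
        for i in range(node_count):
            length = math.hypot(disp[i][0], disp[i][1]) or 1e-9
            capped = min(length, temperature)
            pos[i][0] += disp[i][0] / length * capped
            pos[i][1] += disp[i][1] / length * capped
    return pos


def island_vertices(depths, edges, height_scale=1.0, seed=0):
    """Combine layout x, y with z = DOM depth * height_scale."""
    layout = force_layout(len(depths), edges, seed=seed)
    return [(x, y, depth * height_scale) for (x, y), depth in zip(layout, depths)]


# A hypothetical six-node DOM: one root, two children, three grandchildren.
DEPTHS = [0, 1, 1, 2, 2, 2]
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5)]

vertices = island_vertices(DEPTHS, EDGES)
```

Seeding the generator is one way to keep the structure stable between runs; a real physics engine may still introduce the small random factor mentioned above.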
The 2D and 3D non-island graph representations were only steps in the direction of the landscape idea. The importance of many iterations in a very experimental visualisation idea stems mostly from their artefacts, their faults, their weirdness if you will. For example, the moment I realised that most websites will start with a couple of dandelions (the <head> tag has many childless nodes) I knew this would either look very unnatural in a generative landscape or have to be tackled as an edge case.
As mentioned before, the foremost use of this kind of 3D visualisation of the web is consumption in VR.
This project is still very much at the beginning, and the first small milestone has just been reached: the generation of structures that resemble land masses at sea. It is important to render these low polygon islands well and to somehow visualise the quantity and quality of the data in the HTML nodes.
The next phase will be to slowly introduce replacements or supplements of the web content in 3D space, using machine learning and an object API like Google Poly or a library like SketchUp’s 3D Warehouse. When the logic knows this is a page about chairs, you get chair objects on your island as a visual cue.
From then on the possible paths diverge a lot, but the only definite constraints are these: the result is a 3D traversable space, every DOM should always create the same space, and, if all the relevant information is passed along, we can transform the space back to the original DOM (1 to 1 function).
More questions than answers
I’m slightly aware that a lot of the speculation around VR and the Internet is about how the already vast space of web content can be consumed in this new immersive interface. While not claiming to have a solution, or an answer that projects a solution, I believe there might be a right question to this.
What if you could design web experiences that are based on a new 3D spatial metaphor for the web? What if this metaphor is concise enough to suggest the opposite route, from 3D to the DOM? What if the new skill the web will eventually need is landscape design, architectural design or interior design?
In the next article we’ll see a couple of complete uses of this idea for web browsing in VR!
- The original aharef tool link (Chrome detects malware). Mentions of the aharef tool and discussion here and here, including a comment that the tool was unpublished in 2013.
- Oleg Burlaca’s HTML as graphs in Perl (follow-up post here).
- First public mention of html2graph.
- Sigma.js, by Alexis Jacomy and Guillaume Plique.
- ngraph, by Andrei Kashcha.