The Web-Wide World

Mark Pesce · Published in GHVR · Apr 6, 2017 · 18 min read

PART ONE: THE GREAT WORK

Twenty-three years ago, in a much smaller theatre on the other side of the world, I addressed my first World Wide Web Conference — the first World Wide Web Conference — demonstrating a new protocol used to discover the three-dimensional worlds that would soon be part of the Web, and showing off a new file format for those worlds, VRML.

One of the few photos available online of the first Web conference, May 1994.

Three years ago, as the twentieth anniversary of that conference approached, I looked high and low for any record of the event. Beyond a few brief posts and a few low-resolution photographs, I found nothing.

And therein lies a tale.

It’s difficult to conceive of a world before the Web. I lived more than half my life in the pre-Web era, and yet even I find it difficult to cast my mind back to that time.

That’s a bit weird, isn’t it? Things have changed so much since the Web — changed so much because of the Web — we have effectively entered a new historical time period, cut off from the past.

“The past is a foreign country — they do things differently there.” It’s a country we know only dimly. That’s not because it lies more than twenty years in the past. Rather, it’s because the Web has become the medium of documentation.

The reason I can’t find hundreds of posts and thousands of photos from the First International Conference on the World Wide Web was simply this: the creation of the medium to host all of that content was the topic of the conference.

This is the same room at CERN where they later announced the Higgs boson.

Effectively, that conference took place in a prehistoric era. Before humanity learned how to keep proper records. In order to explore the unrecorded aspects of such a seminal event, paleoanthropologists will one day look through middens for important clues to life as it was once lived.

No, really.

They’ll find that humanity had an entirely different relationship to knowledge.

Beyond what could be carried in one’s own head, knowledge mostly existed in curated repositories. Most of these repositories were personal, some were corporate, and a precious few were public.

The most comprehensive of these repositories — spanning everything they could encompass — quickly grew too large to be usable. You had to know what you were looking for in order to find anything.

It was a dark age.

The “Mother of All Demos” defined how computers work today. 50 years ago.

Forty-nine years ago, in the ‘Mother of All Demos’, Douglas Engelbart demonstrated the first hypertext system, forging organic linkages between data sets. Engelbart saw this as humanity’s last best hope for survival against a rising wave of unfathomable complexity that threatened the continuation of civilisation.

Yes, he really framed it in those terms.

Long before hypertext existed at any meaningful scale, people believed in its promise: to create a vast, ‘universal library’ of linked knowledge.

Those two qualities, vast and linked, are orthogonal. Linked information is valuable in itself, but it also obeys Metcalfe’s Law: its value grows with the square of the number of users.

I learned this when I wrote my own hypertext system back in 1986, and quickly found that a twenty-megabyte hard disk couldn’t hold nearly enough information, or reach nearly enough users, to be generally interesting. Even the hundreds of megabytes available on a CD-ROM felt insufficient to the task.

Something about hypertext demands an ‘all in’ methodology. If you’re going to link data at all, you’ve got to link all the data. Everywhere. A concept that was broadly unthinkable in 1986, before most computers had network connections.

By 1989, that was a different story.

That was the year — I remember it well — when the speed of America’s transcontinental Internet link — between the universities on both coasts — was upgraded from a pokey 56 kilobits per second to a blazing 1.544 megabits per second.

(Just as an aside, the broadband connectivity to my smartphone is two hundred times that.)

The original paper by Sir Tim Berners-Lee that proposed the World Wide Web.

And it was the year a 33-year-old researcher at CERN wrote ‘Information Management: A Proposal’:

It discusses the problems of loss of information about complex evolving systems and derives a solution based on a distributed hypertext system.

The twenty years from the ‘Mother of All Demos’ to the World Wide Web are the same twenty years it took for the Internet to grow from a proof-of-concept to a resilient platform for communications. That’s no accident.

I had my first encounter with the Web at SIGKIDS in July 1993, in a showcase of educational technologies. Idly clicking a few links, I quickly realised there wasn’t a lot of there there, and dismissed the Web as a toy.

I’d overlooked or misunderstood the most important element of what I’d just seen: It was completely open.

Anyone could add their own there, there. And within a few months, everyone did.

It’s important to understand the specifics of this process, both at an individual and at the broader cultural level, because what happened next illuminated a process that had been too slow and obscure to interest anyone except a few philosophers of science.

It worked something like this: someone would find something on the Web which interested them. This never happened by accident. The Web was too obscure for that.

Instead, someone who had an interest in something would use the Web to share their enthusiasm with someone else.

Sharing reinforces the social bond between individuals, and the Web provided a way to ‘lean into’ a natural human trait. But that sharing had a second-order effect, because it frequently led to learning. When someone points you to some bit of knowledge that has value to you — because it aligns with your interests or your needs — you take that knowledge on board, learning from it.

“All doing is knowing, and all knowing is doing.” Everything you do reflects everything you’ve learned, so when someone shares with you, it reinforces the social bond, but when you learn, it reinforces the human drive toward self-actualisation.

Now you have two drives — one external and social, the other internal and self-oriented — but each now amplifies the other. More sharing leads to more capacity, leading to more sharing, leading to more capacity, and on and on and on.

This process defines the boundary that divides the prehistoric from the current era. All of this happened effectively instantaneously: As we ‘fell into’ the singularity of knowledge culture, we closed the door to an era of information scarcity.

We have become a sharing culture because of the Web, and because of that, we have become a learning culture.

We can share everything, but we have yet to learn to winnow truth from lies.

We do not always translate information into knowledge, nor do we always improve our understanding by what we learn. We are perhaps finally coming to understand that there are nuances here that will take us generations to master.

There is more work before us. No one questions that. It’s the reason we’re all here in Perth.

But at the same time, this is a good moment to look back on the sweep of the last generation, to see where we’ve come from, and how unfathomably distant it feels. We are different, and the Web is the principal mechanism of that difference.

PART TWO: THE GREAT HOPE

We’ve come far — and we have a long way to go. We have become sharing beings, but there remains a broad gap between sharing and learning, which means there’s a gap between our capacities and our potential. We’ve done well, but we must do better. We must learn how to learn.

We live within a sharing culture. The most visible consequence of that can be seen in the transition from a knowledge-scarce to a knowledge-abundant culture. We only rarely have too little information to guide our actions. More often we have too much.

Information shapes what we do. Most of the time we won’t do something obviously stupid — if we’ve been informed. Information guides us, and because, on the whole, we’d prefer to make the best decision in any given situation, the more we use information, the more we come to rely on it.

Thirty. Trillion. Pages. (and that’s already 3 years out of date!)

In the overall sense that ‘more is better’, over the last generation we’ve built an amazing global repository of information — somewhere in the trillions of pages, and likely reaching trillions of words — that we use to inform our progress through this world. That could be as simple as today’s weather report — or as detailed as a climate model spanning the next fifty years.

This pressure driving us toward information steadily increased throughout the first fifteen years of the Web. We circled back to our desks, clicking in browser windows, looking for exactly the information that would help us do better. That pattern became normal human behaviour almost instantly. Unprecedented, but such was the pressure of all that information that we took it as perfectly normal.

One of the basic principles of anthropology is that any cultural innovation is fundamentally conservative. Innovation helps us to do what we need to be doing — even though it changes how we do it.

That pressure grew and grew and grew, until all of that informational pressure finally took physical form in an innovation that is the embodiment of the Web and our sharing culture.

The smartphone.

Even Jobs didn’t truly understand what he’d created in iPhone — a mobile Web.

If you go back and watch the keynote where Steve Jobs introduced iPhone, you’ll find a peculiar lack of emphasis on Mobile Safari. Jobs spends all of about 3 minutes touting the virtues of a full Web browser scaled down to the palm of your hand.

No one — including Jobs — understood that putting the Web into the hands of billions was the whole point of the device. Music, video and even text messaging are all great, but the Web is the thing that made the smartphone the must-have tool for what will soon be eighty percent of all adults on the planet.

The mobile Web is the point where the Web goes from being ‘over there’ to everywhere. Ubiquity means the information to make the best possible decision is always at hand.

For all of the rest of human history we will all have all of humanity’s knowledge and experience to draw upon at every moment.

An entirely gratuitous shot of me with Sir Tim.

That’s the Great Work of Sir Tim Berners-Lee.

Innovations preserve, but they also disrupt. The pressure of a global, distributed network of information drove that network into our hands, where it now demands the lion’s share of our attention. We’re continuously connecting, sharing and learning.

That’s all to the good where it increases our capacity, but it’s also zero-sum: attention spent staring into a screen cannot be lavished on anything else.

We’ve become conscious of this wrestle over a limited resource every time we see friends or family or colleagues attending to their devices, and we feel guilty when we do so ourselves.

We moved from knowledge-scarce to knowledge-rich. In order to make ourselves attention-rich, we have to make better use of our attention.

The Web we’ve got today is mostly text, with some images. Even for individuals with high degrees of literacy — which won’t be every one of the five billion who will have Web access by the end of this decade — that’s a serious amount of reading.

Digesting large amounts of text comes with serious cognitive load.

We’re all suffering from some degree of tsundoku (積ん読): the unread books and tabs and articles and posts and status updates that keep piling up.

We want to be on top of things. We need to be on top of things. But things to read and absorb and use are piling up faster than ever before, leaving us perpetually behind.

Although this condition is new to most of us, it is not altogether new.

Sixty years ago, jet fighter pilots suffered from this same cognitive load as they stared at a cockpit of instrumentation, all of it dynamic, meaningful, and vital. Make the wrong decision and lose the battle.

Sutherland’s Ultimate Display is the first VR system. It’s nearly 50 years old!

Tracing a through-line from those jet fighters to Ivan Sutherland’s ‘Ultimate Display’ and NASA’s Virtual Environment Workstation, we arrive at the current state of the art with kit like Microsoft’s HoloLens.

Overlooked in all of that hardware is a more subtle shift: using visualisation techniques to reposition cognitive load away from the parts of the brain that parse the written word — and language with all of its endless ambiguities — toward the massive cognitive resources given over to our visual cortex.

Visualisation is like using the human GPU rather than our CPU. It won’t work for everything, but for many things it’s far more efficient.

That’s not news.

Virtual NYSE gave financial traders the capacity to make decisions 5000 times faster.

Twenty years ago I saw a vast project, written in VRML, running on million-dollar graphics supercomputers, rendering the equivalent of pages of text output from a Bloomberg Terminal — used by financial traders around the world — into a three-dimensional space. Virtual NYSE put a trader into the middle of the data.

Why do that? When the Bloomberg text was set side by side with that same data, visualised, traders could assess and make decisions up to five thousand times faster. That’s the kind of win you can get from visualisation.

This is something we all intuitively understand when we come across a really well-designed infographic. Our minds relax as the cognitive load dips, and we engage those parts of our cognition best suited to this information.

One of the biggest questions hanging over the modern Web is why visualisation plays such a small role within it. For something that offers such an obvious win, you’d think it would be used far more. In one area — page layout — the Web has matured remarkably as a visual medium. We know exactly where to put things to garner attention. But we’re still placing text.

Yet if we look at messaging, we can see in the explosive growth of emojis that we engage other parts of our cognition when given the chance.

So there’s a disconnect between the Web we have — which is largely text — and the Web we need, which is richly visualised.

Genius.

At the far end this means something like a Gibsonian cyberspace or Neal Stephenson’s metaverse.

We may get there, someday. But we’re a long way away from that, because for the last twenty years we’ve focused on one specific area of visualisation. Before we jump into the deep end, we need to broaden our capacities.

We need to be employing 2D and 3D visualisations everywhere.

Right now we don’t. A lot of that is because we’ve educated a generation of designers who are really good at layout but lack the skills to render data into effective visualisations.

Fortunately, we can fix that. As it turns out, we have a medium at hand that promotes the sharing of insights, techniques and best practices — the Web. We can use the Web as a knowledge amplifier and capacity builder for the millions who need to learn the visualisation techniques that will take the Web to the next level of engagement, and lighten the cognitive load of so much text.

We’ve done this before. We’ve already transformed into a knowledge-rich culture. That’s created a pressure that can be alleviated by a transition into a visualisation-rich culture.

The 3D Web is already a reality. There are many tools.

It’s the next great work — and it’s ours. This is the work of our generation. We have a Web, and now we need to bring all of our intelligence to bear to make that Web as broadly useful as possible to as many as possible.

And we need to start doing this immediately. A picture’s worth a thousand words. We can amplify our capacities, using visualisation to ‘lean into’ our natural gifts.

And just to be clear, I don’t really mean our eyes. ‘Visualisation’ is a catch-all term for something more properly called ‘sensualisation’. We need to bring all of our senses to bear — eyes and ears and touch and smell and taste and proprioception — putting them to work on the Web as we do in the richness of the real world. We have many gifts, and no two individuals have precisely the same arrangements of gifts. The Web we’re moving toward is more sensual and more personal.

The Web is coming to live ever closer to us, in smartphones and smart glasses and smart headphones and smart environments. Our world is not text, and the Web, as it becomes more a part of this world, must become more like the world.

Now, let’s talk about that world.

PART THREE: THE GREAT SILENCE

So we come to the present moment. We have an entire generation of the Web behind us, twenty-eight years of work, starting with Sir Tim, all the way to this room, today.

That work has been so successful — beyond any prediction or expectation of the 380 researchers who crowded into an overfull lecture theatre at CERN twenty-three years ago — that we can feel the pressure of the Web as something that’s absolutely substantial and almost tangible.

Yet all of that work remains somehow ‘over there’ — sealed away in what we used to call Cyberspace, or the noosphere, and what we just call online.

It’s come to the point where we feel we inhabit a multiverse composed of two orthogonal universes that never touch — the real world and online.

Sealed away, each universe is largely deaf to its companion. They don’t touch, they don’t interact, they certainly don’t share.

If you listen for the Web in the real world, all you hear is silence.

That’s about to change.

Something happened last year — something both phenomenal and meaningless, both fad and future: Pokemon Go.

For a month it seemed like everyone was playing it, everywhere. And why not? Pokemon Go rewrote the real world with elements it imported from online, creating an ‘augmented’ reality hybrid of both.

That’s when all the trouble began.

Just because you can play doesn’t mean you should.

It wasn’t long before reports surfaced of people eagerly playing the game in all sorts of places where it was, quite frankly, inappropriate. The most notorious of these were the death camps at Auschwitz.

How could something so profoundly offensive emerge as a consequence of such joyful play? In essence, the game acted on its own knowledge of the world around it. The world had no way to speak to the game — because the online and the real world occupy two separate universes.

When all of these reports surfaced in the media, something began to nibble at the back of my mind, something that took me back all the way to my first years working in virtual reality, studying the hard problem of permissions.

When you’re the only person in a virtual world, it’s yours to do with as you please — and that’s still largely the case for many virtual worlds. But as soon as you add even one other person, you need to have a deep and careful think about the permissions in that world.

It’s analogous to the design difference between a single-user operating system like Windows 3.1 and a multi-user operating system like Unix. In a multi-user system, every element of the filesystem and the operating system carries permissions that dictate who owns it, who can use it, who can see it — and what they can do with it. Forty years of settled computer science.
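For anyone who hasn’t stared at this recently, here is a minimal sketch of what those per-element permissions look like in practice, read with Python’s standard library; the file path is just an example.

```python
# Reading the Unix permission metadata on one file: who owns it, which group
# shares it, and what the owner, the group, and everyone else may do with it.
import grp
import os
import pwd
import stat

info = os.stat("/etc/hosts")                  # any path will do for the example
mode = stat.filemode(info.st_mode)            # e.g. '-rw-r--r--'
owner = pwd.getpwuid(info.st_uid).pw_name     # who owns it
group = grp.getgrgid(info.st_gid).gr_name     # which group shares it
print(f"{mode} {owner}:{group} /etc/hosts")
```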

From Pokemon Go we’ve all learned that we need something similar for the real world, and we needed it yesterday.

We needed it 25 years ago, too, and the foundations for a permissioning system for virtual reality formed the core of the paper I submitted to the First International Conference on the World Wide Web.

At the time, people couldn’t get their head around the use case. It was too far out. Instead, everyone focused on the visualisation layer built atop this permissioning system — that became VRML.

Early VRML worlds rarely favoured aesthetics.

Twenty-three years ago I believed virtual reality was the immediate future of the Web. I did not understand that we were in the midst of a singular transformation into knowledge culture. I doubt anyone did. People had other things on their minds, other agendas to pursue, other problems to solve. The promise of VR — and visualisation more generally — withered on the vine.

So did this idea of permissioned space.

Or rather, all of it waited, patiently, until further progress into knowledge culture became impossible.

Virtual reality is the leading edge of a range of new techniques that blend the real world with the online world, knitting the multiverse together. Augmented realities like Pokemon Go mix them through a smartphone screen. Mixed realities combine the real and the online seamlessly.

In order to marry the online and the real worlds, we need a framework, a system — a protocol for doing so.

In that WWW1 paper, we called it ‘cyberspace protocol’. (Yeah, that was a hip word, back then.)

A generation later we’re calling it the ‘mixed reality service’ — and what it does is very straightforward: it binds coordinates to URIs.

Mixed Reality Service — MRS — is meant to be a near analogue of the Domain Name System, DNS. Just as DNS maps a namespace to IP addresses, MRS maps a coordinate space to URIs.

It’s important to note that MRS can map any coordinate space. They could be geospatial coordinates, three-dimensional virtual coordinates, four-dimensional space-time coordinates, or what have you.
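To make the DNS analogy concrete, here is a minimal sketch, in Python, of the core binding: a region of geospatial coordinate space mapped to a URI, with a point-in-region lookup. The record shape, the toy registry, and the sample coordinates and URI are illustrative assumptions, not the draft specification.

```python
# A toy MRS binding: a patch of geospatial coordinate space maps to a URI.
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class MRSRecord:
    lat: float        # centre of the region (WGS84)
    lon: float
    radius_m: float   # extent of the region, in metres
    uri: str          # what this patch of the world points to

# An in-memory registry standing in for a distributed, DNS-like service.
REGISTRY = [
    MRSRecord(-31.9523, 115.8613, 250.0, "https://example.org/perth-cbd/meta"),
]

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6_371_000.0
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dl / 2) ** 2
    return 2 * r * asin(sqrt(a))

def lookup(lat, lon):
    """Return every URI bound to a region containing this point."""
    return [rec.uri for rec in REGISTRY
            if haversine_m(lat, lon, rec.lat, rec.lon) <= rec.radius_m]

print(lookup(-31.9525, 115.8610))   # ['https://example.org/perth-cbd/meta']
```

Binding a three-dimensional virtual space or a four-dimensional space-time region follows the same pattern with a different distance test.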

This means MRS can be used to permission virtual space — so visitors can’t run amok in your virtual world — but, much more vitally, MRS adds a missing metadata layer to the real world, adding links in space.

The real world has been silent. MRS gives the real world a voice.

What might the real world have to say? Anything relevant. MRS doesn’t specify anything beyond a valid Uniform Resource Identifier.

In the most likely use case, this URI points to site-specific metadata, but just as with the Web, that URI can point to anything. It might be HTML, but it could just as easily be audio or video or any other data. The Web doesn’t care. It just connects.

Let me walk through four use cases — and because this is a very technical crowd, we’ll go through some pseudo state-diagrams to show how MRS works on the client. A minimal code sketch of the flow they all share follows the use cases below.

Autonomous vehicles & drones:

A drone uses MRS to discover whether it has overflight rights, or can use its camera.

Site safety:

Farmers can send pickers out to the field without worrying they’ll end up exposing themselves to pesticides.

Building information:

With MRS, anyone can get the directory of a building — without having to know its URL.

And back to where we started, AR Games:

A game that talks to the world with MRS can play nicely with others.
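All four use cases share the same client-side flow, so here is one minimal sketch of it in Python: resolve the device’s coordinates against an MRS resolver, read what comes back, and act on it. The endpoint URL, query parameters, and metadata convention are assumptions for illustration, not the draft protocol.

```python
# The flow shared by the drone, the farm, the building directory, and the
# AR game: ask the world what it has to say at these coordinates, then act.
import json
import urllib.parse
import urllib.request

MRS_ENDPOINT = "https://mrs.example.org/lookup"   # hypothetical resolver

def mrs_lookup(lat: float, lon: float) -> list:
    """Return the records (URI plus metadata) bound to this point in space."""
    query = urllib.parse.urlencode({"lat": lat, "lon": lon})
    with urllib.request.urlopen(f"{MRS_ENDPOINT}?{query}", timeout=5) as resp:
        return json.load(resp)   # e.g. [{"uri": "...", "metadata": {...}}]

def may_play_here(lat: float, lon: float) -> bool:
    """An AR game checks whether play is welcome before spawning content."""
    for record in mrs_lookup(lat, lon):
        # Assumed convention: a site can declare itself off-limits to AR play.
        if record.get("metadata", {}).get("ar_play") == "prohibited":
            return False
    return True
```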

Mixed Reality Service isn’t a new idea, but it wasn’t until last year that we understood why the world must speak for itself.

Now that we’re about to have five billion people all walking around with smartphones equipped with GPS and connected to high-speed mobile broadband networks, we have exactly the tool that needs exactly the service MRS provides.

With MRS, a smartphone in someone’s hand is exactly the point where the real world and the online world can knit themselves together meaningfully, helpfully, and enduringly.

An initial draft specification for the MRS protocol was introduced in October, at the first meeting of the W3C WebVR Community Group. With the support and encouragement of that working group and the W3C, last month we launched our own Mixed Reality Service Community Group, beginning the careful and deliberate process that will hopefully lead to a Working Group sometime toward the end of next year, together with a draft public specification.

But you can try the Mixed Reality Service today. Earlier this week I went around the Perth CBD adding a few choice sites into the database of MRS coordinate-to-URI mappings. If you point your phone at mixedrealitysystem.org/demo, you can have a play for yourself. MRS is already here; it’s real, and — with your help — it can only get better.

And we do need your help.

There are so many years of experience, so many domains of expertise represented in this room that we’d be foolish to think we could go it alone. The whole history of the Web is proof that doing it together is the way to do it right.

So please, if you can, become a member of our Community Group. We need to learn from you. It will make our work better.

That brings us up to the present.

The Web has transformed the world, but stands aloof from it. It’s the work of the next generation to knit these two together, sewing them into a seamless whole. That will transform our world as much as the Web has already transformed us.

Mark Pesce

VRML co-inventor, author, educator, entrepreneur & podcaster. Founded programs at USC & AFTRS. Columnist for The Register. MRS. Next Billion Seconds. MPT.