We’re all familiar with the iconic scene in “Minority Report”, but not so many people know that the technology behind the gestural interface Tom Cruise uses to sift through data is very much grounded in reality, and it’s set to change the way we all interact with digital environments.
John Underkoffler is the designer and engineer who led the team that came up with this interface, called the g-speak Spatial Operating Environment. His company, Oblong Industries, was founded in 2006 with the goal of creating the next generation of computing interfaces. Oblong is currently focusing on Mezzanine, an immersive visual collaboration solution for the workplace. But the big vision is ubiquity: Mezzanine on every laptop, every desktop, every TV, every dashboard.
Before founding Oblong, Underkoffler spent 15 years at MIT’s Media Laboratory working on holography, animation, and visualization techniques, and building the I/O Bulb and Luminous Room systems. He has served as science advisor on films including “The Hulk”, “Aeon Flux”, and “Iron Man”, and serves on the board of the E14 Fund, the board of Sequoyah School, and Cranbrook Academy’s National Advisory Council. He won the 2015 Cooper Hewitt National Design Award for Interaction Design.
Tell us a little more about Mezzanine — where did the idea come from? What sorts of companies are currently using it? And how much was it inspired by the “Minority Report” interface?
In one sense, Mezzanine is a product that’s the distillation of nine years of work and design and thinking and engineering here at Oblong. And to an extent, it continues the work that moves through “Minority Report”; it’s part of that continuum. Mezzanine is a really strong expression of what Oblong is, and what I believe is most important in a computational experience of human-machine interaction. But from another point of view, it literally is those scenes in “Minority Report” made flesh.
So when people write, think, or talk about the film, they fixate on the gestures in the computing scenes with Tom Cruise, Neal McDonough, and Colin Farrell. You see them using these big, pixel-rich, gesture-based computing environments. People fixate on the hands, and while that’s important, what is even more important is what they are doing. Those scenes depict people with a time-critical task working together, using an interface that’s like an ‘exo-skeleton of intent’ to sift through mountains of data. The UI makes it possible to solve the problem, a problem so complicated you couldn’t solve it today using conventional tools. That, at the core, is what those scenes show. And that’s exactly what Mezzanine is: a giant pixel space that lets multiple people work together in a hyper-visual collaborative environment, and literally solve problems of a character, and at depths of complexity, that you can’t get at any other way.
One of the things that’s truly exciting for me and for the team at Oblong is the breadth of our customer list. Our customers are not just clustered in one domain or one industry vertical. It really stretches across the entire spectrum. Which was the point: Mezzanine is intended to be that next generation collaborative computing experience that works for whatever you’re doing. So at one end of the spectrum we’ve got companies like Dentsu Aegis, connecting together locations from New York to Singapore to London where creative teams are working, often in real time, on richly visual information and highly refined and highly subtle stuff. So that’s the literal kind of synthesis and creation of visual information. We’ve got Mezzanine used in educational situations. We’ve got customers like Schlumberger using Mezzanine in the context of energy and oil and gas refinement. We’ve got customers in the financial services world. We’ve got customers in big, heavy manufacturing like GE and Boeing. We’ve got customers who are in the consulting world, where the value they bring to their customers lies in the human-to-human transaction. (It’s in the transduction of ideas and connection of expertise, much of which is often sitting with the client but which the consultant has to kind of coax out and connect together.) And so in all of these contexts, and literally dozens and dozens of different business verticals, Mezzanine is that universal medium that unites teams, connecting their brains by providing a consensual and real-time view, at large scale, of information and work efforts as they evolve.
To imagine that an environment like this (let’s call Mezzanine a set of pixel behaviours) would only be available in the workplace makes me want to weep. If you work all day in this environment, throwing ideas around, bringing up applications and data sources, connecting with the people you’re working with or playing with across big distances, and then you have to go home to your extremely stupid display device, that’s really frustrating. So there’s no reason at all that in two, three, five years you won’t have this at home; we’re going to bring it to you.
What have been the biggest challenges during the project?
Well, we could certainly talk about UI design. Through all the techniques that people have built and evolved over the last 35 years (the desktop computing approach that carries laterally into mobile computing, and then a little way into touch surfaces), the UI has remained essentially unchanged; it has just been better refined. All of those techniques aren’t necessarily applicable when you’ve got not just one screen but as many screens as you want; when you’ve got not just one person interacting with a machine but as many people as the room holds; and not just one room but as many rooms as you care to bring together. So at Oblong we’ve invented new UI techniques, but that’s not the hard stuff; that’s actually the fun stuff. The hard stuff is getting the world to understand this new thing. Of course, the message is really simple: now the pixels on your walls behave like the rest of the physical world. Anything you know how to do with a whiteboard, or with tacking photos up on a cork board, your pixels work like that. But it’s a new category and a new way to think about computing, and we are creatures of habit. As human beings, it’s really hard for us to cross the boundary of comfort into new modes of behavior, even when they are hugely beneficial, even when the new modes are actually fun to use and highly rewarding, as Mezzanine is. This isn’t a unique problem; it’s what happens anytime anyone brings a radically new product or product category to market. That really has been the biggest challenge, but we’re largely past it now, which is great.
When you were working on the “Minority Report” interface did you know that it would become a real product?
If there was any hint that it might become a real-world system, it was only in my head. The brief for the film was very simple: we needed to show people, in a credible, believable way, what interaction with the computer might look like in 50 years. But that’s a very important idea. Yes, it was ‘just’ for the purposes of a film, but there was a very prominent director insisting the result had to be believable. That’s both a heavy burden and an unbelievable opportunity. The director’s insistence gave me permission to work on that fictional design problem as if it were a real product design problem, because in order for it to be comprehensible, believable (and maybe even desirable) for audiences, it would have to have the same properties that make something usable and comprehensible in the real world.
It’s actually a really valuable prototyping process, in a way. Imagine this has to go in a film: you’ve got five seconds to show someone a new product, a new interface, a new interaction, a new way of understanding data, a new kind of social medium, whatever it is. That’s the brief. So even if it doesn’t end up in the film, what you have is a really good start on a product, or a piece of a product; it’s a different kind of discipline, a different point of approach.
What was the process of creating that original “Minority Report” system like?
I flew out to Los Angeles from Boston on US Thanksgiving Day 2000 — so roughly the 24th of November — and got to work the next morning. We worked through all of 2001, so it was really a year. There were intense periods of focusing on one element or another. The hand language, for example, was maybe a month of design and refinement; then it was another week to produce a dictionary that served as a training manual, another two weeks to create a training video, then a week of personally training the actors, and then a week of shooting.
We have seen a lot of gestural UX experimentation within the gaming industry. Are there any other industries that are pushing the boundaries in this space?
It’s actually a bit of a shame that gaming isn’t more studied for that, and although they made a valiant run at the gestural thing, I think there’s only been partial success. I’m much more interested in the amazing things that are possible in a huge variety of games with a simple handheld controller. There are a few things at play here. First, the users of the system are highly motivated to become experts at the UI. Second, it’s possible to become an expert at that UI. When you use a mouse there’s only so much you can do; how good can you get at a mouse driving Microsoft Word? And that’s neither a fault of the mouse nor of Microsoft Word; it’s just that a closed system can only go so far. But in the gaming world people have figured out six-degree-of-freedom navigation. They’ve figured out targeting. They’ve figured out all sorts of causal connections where you go around and connect this thing to that thing: you shoot a hole in that wall, and another one in this wall, and you go in there and pop out here. All this has been designed in a way that people can learn, and what results is that you become way more sophisticated than you would be with a general-use computer. So I wish more people studied video games, because that’s where the best UI design and the best UI results, viewed in a certain way, have been achieved over the last 25 years.
What’s next for Oblong?
In a way the most exciting thing we can imagine at Oblong, and not just imagine but what we’re working toward, is ubiquity. We talked earlier about a world where you would have Mezzanine on your TV at home; of course you would! If it’s good here at work, it’s even better where all of your personal stuff is. And that suggests a kind of spread that goes everywhere. So I think at the end of the day, for me, there is no reason why every device in the world (the ones you carry with you, the ones attached to walls, the ones in professional environments, in personal environments, in public environments) shouldn’t have all of its displays connected together by a new kind of UI, a Mezzanine-like UI.
When our work at Oblong is done, you’ll be able to walk up to any screen in the world and let your work, your data, your pictures, your everything flow onto it, and join in with other people and use those pixels all together and in parallel. The idea is to detach the pixels from a limited idea of what computation is. Right now we literally weld pixels to a computational surface, and a UI goes with it. That’s it. The mouse cursor can’t get off the edge of the screen because there is nowhere for it to go, and the screen can’t get off the CPU because, even though logically they are separate, they are physically combined and our idea of what they do together is fixed. But we can break that open. The most important thing in the world is the pixels: that’s where the information resides, and that’s where the interaction is going to be localised. So let’s adopt a radically freer (but disciplined) view of how these pixels work, what they can do, and who can use them.