Google’s Project Tango — a look at how it could affect spaces and its possibilities.

Fabin Rasheed
Abstract Code
16 min read · Jul 10, 2016

Google’s Project Tango [1] offers a new way of mapping spatial information on handheld devices. Using depth cameras and computer vision, 3D depth and color information is fed into such devices, enabling real-time augmentation. This opens up a whole new spectrum of possibilities for real-time interaction and enhanced spatial perception. It would also involve the creation of a new kind of virtually connected space, not far from reality. The impact this could have on how people interact, and the scale at which such spaces can be used, is profound. Here we look at some of the key aspects of this new intervention.

Depth sensing and 3D mapping have been around for some time now. The popular Microsoft Kinect [2] has been a favorite among its users for making applications ranging from media installations to games. It brought about a new perspective on how people interact with systems. Suddenly, the spaces around them started to have life. Although Kinect did its fair share of embodying people in the virtual world, it did so in finite spaces and localised areas. Microsoft Hololens [3], on the other hand, is similar to Tango: it augments physical spaces with virtual objects. But Hololens, being a head-mounted display (HMD), lacks the social virality, numbers and ease of use that come with handheld devices like a smartphone. Rekimoto et al. [4] discuss how virtual objects can be tagged to spaces and how this can act like post-its in virtual spaces. The idea of situated computing comes into play in augmented reality, which is highly location and context aware. In their paper on situated computing, Hirakawa et al. [5] discuss how situated computing should produce more intelligent systems that cater to the needs of users at a particular location or in a particular context. Avouris et al. [6] review connected, location-based games played across physical and virtual spaces: their narrative structure, the interaction modes they afford, their use of physical space as a prop for action, the way this is linked to virtual space, and the possible learning impact of the game activities. What we find is a general tendency towards augmenting the real world with space-based information. But how much could this augmented space be used as a connected space? At what scale? What are the possibilities of using the space? How would spaces be categorized as private, public and group based? How would this affect physical spaces and human behaviour? These are some of the questions addressed here, along with the future possibilities of Tango.

Project Tango maps the surrounding world using a mobile phone. It enables two new features: 1) location-based positioning of virtual objects, and 2) movement through virtual space. The present technology allows indoor spaces to be mapped accurately. This means a cluster of spatial data for a particular area, tagged with its location. The obvious question is whether this would mean integrating the data with mapping software and building a world map of virtual areas. This could be the direction in which Project Tango is taking spaces. Since the technology will reach the masses through their handheld devices, mapping and regularly updating the data could happen easily. This would mean a whole world mapped and stored in a cloud, retrievable at will. It could be a new form of interconnected, location-based content delivery platform. Another way to look at it would be as an Augmented Reality World Wide Web. For convenience let’s call it the AW3. This introduces opportunities in many sectors. Businesses could be looking at a new kind of storefront after PCs, tablets and mobiles. It would also come with privacy factors, ownership rules and collaborative decisions. Let’s look at these.

Ownership: AW3 would have virtual objects tied to physical (real-world) positions and objects. This means the ownership of such virtual objects may lie with the person who owns the physical space, or with the person who created or bought the content within the space. As such, ownership in AW3 can be considered to fall into one of three categories: private, group or public. Assuming that each ownership category can be changed according to preference, with permission, people can “design” each space according to their requirements. Say a house is designed for the general public in AW3. The house would have elements showing what the house owner wants the public to see. But this changes when it comes to the family living there. They would have a shared group space, which might have anything from family games to post-it notes as reminders on a particular wall. As for private spaces, they would usually be more functional in nature. Since a second person is unlikely to view such a space, how much attention its aesthetics receive would depend on personal preference. Hence what we see are three different layers of augmented spaces. Such layers would also be very important when it comes to the security of indoor spaces.
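The three layers described above can be pictured with a small data model. This is only a sketch under assumptions: the names `Layer`, `VirtualObject` and `visible_objects` are hypothetical and are not part of Project Tango or any existing API, and the permission rule (owner sees everything, group members see group objects, everyone sees public objects) is one possible reading of the layers.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    PUBLIC = "public"
    GROUP = "group"
    PRIVATE = "private"


@dataclass
class VirtualObject:
    name: str
    layer: Layer
    owner: str
    # Members allowed to see a GROUP object (hypothetical permission model).
    group: frozenset = field(default_factory=frozenset)


def visible_objects(objects, viewer):
    """Return the names of the objects that `viewer` may see."""
    result = []
    for obj in objects:
        if obj.layer is Layer.PUBLIC:
            result.append(obj.name)
        elif obj.layer is Layer.GROUP and (viewer == obj.owner or viewer in obj.group):
            result.append(obj.name)
        elif obj.layer is Layer.PRIVATE and viewer == obj.owner:
            result.append(obj.name)
    return result


# An illustrative house with one object in each layer.
house = [
    VirtualObject("facade artwork", Layer.PUBLIC, "owner"),
    VirtualObject("wall reminder", Layer.GROUP, "owner", frozenset({"spouse", "child"})),
    VirtualObject("work dashboard", Layer.PRIVATE, "owner"),
]
```

With this model a passer-by would see only the facade artwork, a family member would also see the wall reminder, and the owner would see all three objects.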

Delving deeper into the concept of ownership means asking how value is associated with spaces and who owns them. Assuming a shared right of holding between the AW3 service provider and the actual owner of the real space which is augmented, both the service provider and the owner would have the right to publish in the space. This would in turn entitle them to the value elements in AW3, which include paid usage and advertisements. Real-world advertisements on properties involve physical attachments, which makes them rather unappealing to homeowners and owners of private spaces. Virtual advertisements in the AW3 could see more acceptance. Static advertisements would become less likely, giving way to interactive advertisements, mini-stores or portals to stores. The difference between a virtual world like Second Life [7] or Entropia Universe [8] and AW3 is that AW3 is tied to real space. The former can have large and scalable areas, and the value of a particular area is tied only to the elements in it. AW3 areas are far more limited: since the real world is tied to AW3, all elements will be in relation to the real world. This means the value of an area will be determined by ownership, real-world value and the elements in the area. This could in turn give rise to a virtual real estate market.

Privacy: Privacy factors involve restricting access to personal spaces, real-world data and group data. The data in AW3 will not be restricted to visual data but will also include audio data. People would want to restrict access to their conversations as well as to visual data. Deciding how security levels are set for each location and each element would be a significant design challenge. Another aspect of ownership to look at is collaborative spaces. These come under the group ownership category. Collaboration could be in situ or remote. Although such group spaces (and public spaces) are layered over each other, a person’s physical presence becomes an obstruction to simultaneous use of the space. This brings in the concept of space reservation. For example, if you have a meeting with a friend which involves interacting in an area, nobody else can use that area (plus a safety threshold area) unless they are part of the interaction. Collaboration would involve spatial changes and element changes in each ownership category. A manipulated virtual object should be updated in every collaborator’s system. Local collaboration can happen this way. But when it comes to remote collaboration, the collaborators would require “remote similar objects”, i.e. if you are collaborating with a friend and you have a table surface, the friend should also have a similar real-world surface to interact with. This would give rise to a new field of specialized real-life objects.
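The space reservation idea can be sketched as a simple geometric check. The sketch assumes a circular reserved area on a flat floor plan measured in metres. `Reservation` and `can_enter` are hypothetical names chosen for illustration, and a real system would need proper spatial data rather than a single circle.

```python
import math
from dataclasses import dataclass


@dataclass
class Reservation:
    center: tuple            # (x, y) in metres, in a local floor-plan frame
    radius: float            # radius of the interaction area, in metres
    safety_margin: float     # extra buffer kept free around the area
    participants: frozenset  # people who are part of the interaction


def can_enter(reservation, user, position):
    """Return True if `user` may use the space at `position`."""
    if user in reservation.participants:
        return True
    distance = math.dist(reservation.center, position)
    # Non-participants must stay outside the area plus its safety threshold.
    return distance > reservation.radius + reservation.safety_margin


meeting = Reservation((0.0, 0.0), 2.0, 0.5, frozenset({"you", "friend"}))
```

Here a stranger standing one metre from the centre would be refused, while the same stranger three metres away, outside the 2.5 m reserved radius, would be free to use the space.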

As for “securing” things, security would not be limited to encryption but would extend to position: the user would have to be at a particular location to access data. This would in turn give rise to objects being kept, saved and lost. It would also give rise to a whole new mode of search, based not just on text but on design: shape, color, texture, function etc.
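Position-based access could, for instance, be approximated by a distance check between the user and the place where the data is anchored. The following is a minimal sketch assuming latitude and longitude coordinates and the standard haversine great-circle formula. The function names and the 5 m threshold are assumptions for illustration, and indoor positioning would in practice need something finer than GPS.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def can_access(user_pos, anchor_pos, max_distance_m=5.0):
    """Allow access only when the user is within max_distance_m of the anchor."""
    return haversine_m(*user_pos, *anchor_pos) <= max_distance_m
```

A combined scheme would still encrypt the data; the location check only adds a second condition on top of it.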

Interior design and spaces: How Tango would affect spaces can also be seen at the level of interior design and changes to interior spaces. Large-scale group interactions and freedom of movement would require large empty spaces. This would lead to the creation of “free rooms” devoid of obstructions. Alternatively, every interaction in a room could be intelligently designed to take obstructions into account. If we follow the former approach, people would mostly be moving, and when seated, would sit on the ground. It could also mean the arrival of dynamic furniture along the lines of MIT Media Lab’s inForm [9], but at a bigger scale. This would involve dynamic creation of furniture and spaces according to requirement. Another change would be the creation of real-life interactive objects whose effects show up in the AW3. This could be anything from a new piece of furniture which shows a virtual screen on touch, to a complicated sculpture whose physical properties are changed to alter the virtual configuration. Again, this would give rise to a new kind of economy in object design.

Behaviour: The most obvious effect on human behaviour would be increased movement. Technologies such as PCs, smartphones and tablets have hitherto made people increasingly sedentary. With the arrival of augmented platforms which have objects attached to locations, the chances of movement would increase. Especially when gaming, users will have an innate motivation to move through spaces, depending on how the games deliver their content. Interaction with an object could change its state. This would mean a saved state visible to another person, giving rise to more space-based social interactions.

Another kind of behavioural change would be the “alternate world”: a completely different world designed on top of the present world, essentially leveraging the spatial and physical properties of the real world and combining them with the design of the virtual world. For example, there could be an alternate world where everything is based on a particular genre of music or art philosophy. There would be groups who start spending more time in such worlds, access to which would be limited to these groups, leading to alternate scenarios of life and society being played out. Although such alternate worlds would be useful in areas like immersion therapy, the addiction factor and loss of reality in such worlds would be a concern. Another factor which comes into play with Tango is an increased effect of presence. It would not be long before connected stationary cameras with depth sensors continuously map spaces. This means that people and virtual objects would not be just a 3D projection on a screen; the space around them would be navigable, i.e. we could look at them from different angles and hear them from different directions. The increased feeling of presence would, to an extent, increase intimacy. This would mean more meaningful remote interaction between people. An increased feeling of presence would also tie to emotions like fear, apprehension, excitement etc., which could be leveraged in games. Thinking of social networks, increased presence would lead to people sharing full experiences rather than just pictures. People would want a friend to be part of the experience rather than just be a remote viewer, leading to highly interactive social networks. It would be interesting to see the “selfie” tradition changing into a “fullie” tradition.

Movement of people is yet another perspective to look at. As mentioned before, the chances of people moving would increase, but in some respects they would also decrease. They would increase in the sense that objects are now location specific, and continuity of connection and interaction would essentially involve movement. On the other hand, the ability to project scenarios and alternate worlds into a specific location would further reduce the differentiation between home and office. Movement would thus decrease on a global scale but increase on a local scale. Another interesting behavioural change would be randomness in movement. The location-specific connectedness of objects would lead to more randomness in the movement of people when viewed as a whole. For example, people in malls might not follow the regular pattern of movement, but would make more stops, turns and turnarounds, as well as looking up and looking down more often. An effective implementation of the virtual world can direct attention to hitherto unseen areas. Group behaviour would also mean that people using the same application would be guided in the same directions, essentially leading to flocking of people while moving.

There would also be the rise of virtual living beings. These could be animals, avatars, creatures etc. What makes them different from the virtual worlds so far is interactivity. Soon, the user’s hands will be tracked in 3D. This would mean interactions with objects would be spatial. With location-based collaborative linkage, multiple people can interact. This would essentially mean an object-oriented approach to designing such living beings, and even other objects. Machine learning techniques would be employed to learn the real-life interactions of animals and reproduce them in the augmented world. Such intelligent beings would essentially learn behaviours and interactions from users collaboratively. Another way to look at it is that the being would be ever present in the virtual world and become active only when an observer looks at it. Also, the being could learn from the virtual world as well as the real world, leading to its own evolution. Situated feedback would also mean that the behaviour and content of virtual elements change with place and context. This could be anything from a “shivering dog” in a cold place to fewer distractions and situation-aware alerts in hospitals or while crossing a road.

Another aspect of change in behaviour would come from the concept of replay. Replay essentially means replaying a scene which occurred previously. It could be a conversation, or a football game which one could not watch. The replay could be at any angle and at any scale. This would mean reliving a past occurrence in a more realistic fashion. Initially this would lead to a privacy scare. But there is also the possibility of editing moments of the replay such that only events relevant to the user remain. Users would re-experience events, reinforcing them and gaining better perspectives and judgements on them. This would also mean better attention to finer details. All this could lead to repeated viewing of the past, revised conversations with people, changes to previous agreements or decisions, and better personal feedback on events. This would hence lead to better decisions, be they social, judicial or otherwise.

Rise of new jobs and demanding old ones: A new set of jobs will be in demand in the market. One of them would be the Augmented Experience (AX) designer. Others would be augmented architects and interior designers. All these professionals will deal with how things can be designed for the augmented space. Their major constraint and focus would be the fact that the AW3 is tied to real space. They would design experiences for the AW3, which would involve many factors such as depth perception, freedom of movement, networking and collaboration, 3D aesthetics, viewpoints and perspectives, motion, elements of attraction and distraction etc. A new challenge would be to design UI in space which derives from elements of 2D UI yet caters to real-world interactions and dynamics. A new kind of responsive design would come into place, which would not be just a 3D version of a 2D UI but a surface-aware, context-aware, intelligent version of it. 3D artists and sound designers would have a lot more on their hands when it comes to creating content for the AW3. The challenge for design directors would be to think from multiple points of view (POVs) and enable a consistent experience. This would also mean the development of a visual language, a platform for consistent 3D experience and control. A new kind of Document Object Model (DOM) would have to be developed, one which takes into account 3D objects and interactions, and is scalable and searchable. This DOM should ideally be an extension of popular DOMs presently in use, like the HTML5 DOM. A new set of programmers would be in demand: the ones who can code for space. This would include game designers and 3D front-end developers. At the back end we would see people working to link and interconnect the AW3, and also to interconnect the Global Augmented Map which would result from the AW3. This would mean a whole new array of services, from virtual real estate to render farms. Each location in the public layer would be affected by the speed of the servers serving that location’s content provider. Conversely, an ideal situation in which all content is distributed over all connected devices should also be considered.
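To make the idea of a searchable 3D Document Object Model slightly more concrete, here is a minimal sketch of a tree of spatial nodes that can be queried by tag, loosely in the spirit of looking up elements by tag name in an HTML DOM. Everything in it (`SpatialNode`, `find_all`, the tag names) is a hypothetical illustration and not an existing standard or API.

```python
from dataclasses import dataclass, field


@dataclass
class SpatialNode:
    tag: str
    position: tuple = (0.0, 0.0, 0.0)  # offset from the parent node, in metres
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def append(self, child):
        """Attach a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def find_all(self, tag):
        """Depth-first search for every node with the given tag."""
        matches = [self] if self.tag == tag else []
        for child in self.children:
            matches.extend(child.find_all(tag))
        return matches


# A room containing a wall with a note on it, plus a free-floating note.
room = SpatialNode("room")
wall = room.append(SpatialNode("surface", (0.0, 0.0, 2.0), {"kind": "wall"}))
wall.append(SpatialNode("note", (0.5, 1.2, 0.0), {"text": "Buy milk"}))
room.append(SpatialNode("note", (1.0, 1.0, 1.0), {"text": "Floating reminder"}))
```

Calling `room.find_all("note")` would return both notes regardless of where they sit in the hierarchy, which is the kind of searchability the paragraph above asks for.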

The field of entertainment, especially movies, would benefit hugely and creatively from the replay factor of AW3. The new movie theatres would consist of physical props and interactables. People would view a movie from different angles and viewpoints, requiring more effort and intelligence in everything from acting to CGI. On the receiving end, the audience would get a POV-based movie which is more immersive and can also be interactive. A director would now need to think of a space as a whole and of the viewer’s viewpoint from every possible location within the action area. This would also mean new ways of directing attention, using principles from contemporary first-person shooters (FPS) and role-playing games (RPG).

Objects, surfaces and interaction: Objects in the augmented world would be of different types based on their interactability. These could be stationary objects, floating objects, objects with physical properties (gravity, elasticity etc.) or dynamic objects. Interactions would be highly dependent on technology and on how finely the details of the user’s physical body can be detected. A new level of hierarchy for elements can be given by depth (nearer being more important than farther). An additional layer of focus and blur could give better targeting by using the look-at ray of the device. A new kind of interaction would be to move objects between the AW3 and the device, i.e. an object may be picked from local storage on the device and dropped into the AW3. Since each object would have different properties, this interaction would differ for each object: an object obeying gravity may simply be dropped, while graffiti on a surface would require defining the surface and remapping the graffiti onto it. This would also mean highly application-specific interactions and affordances. As for content that provides information, like chunks of text and photographs, a whole set of interaction possibilities opens up. Every surface becomes a potential carrier of text or image, giving rise to a whole new set of design opportunities. Intelligently identifying surfaces would give better control over large-scale design. Real-life objects mapped using Project Tango could be tagged in the future. This would help identify interactions pertaining to that object in the virtual world. For example, a user could tag a sitting area as a soft couch. Virtual objects could then press into the surface of the couch, which would have elastic properties. Surfaces would also be accessible to virtual living beings like virtual pets. Providing surface data and object data as tags to such virtual living beings could make their interactions more intelligent, e.g. a cat would recognize a bowl of milk. Also interesting to note is the interlinking of interactions, enabling control of one device from another. A very good example of this is the Reality Editor project [10] of the Fluid Interfaces group at MIT Media Lab. Design decisions to be taken while designing for the AW3 would also include whether to show everyone the same view of a particular interactive element (using the ray cast by the user), whether to allow simultaneous interactions with elements, whether there should be simultaneous full-body presence of users (as at a concert), whether there should be a toggle for such physical presence, etc.
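The object types and the depth-based hierarchy mentioned above could be represented roughly as follows. This is a sketch under assumptions: the `Kind` categories mirror the four types named in the text, while `SceneObject` and `by_priority` are hypothetical names, and ranking purely by straight-line distance from the device is only one simple way to read "nearer is more important".

```python
import math
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    STATIONARY = "stationary"
    FLOATING = "floating"
    PHYSICAL = "physical"  # obeys gravity, elasticity etc.
    DYNAMIC = "dynamic"


@dataclass
class SceneObject:
    name: str
    kind: Kind
    position: tuple  # (x, y, z) in metres, in the device's frame


def by_priority(objects, device_position=(0.0, 0.0, 0.0)):
    """Order objects nearest-first, so nearer objects rank as more important."""
    return sorted(objects, key=lambda o: math.dist(o.position, device_position))


scene = [
    SceneObject("far poster", Kind.STATIONARY, (0.0, 0.0, 5.0)),
    SceneObject("near ball", Kind.PHYSICAL, (0.0, 0.0, 1.0)),
    SceneObject("drone", Kind.FLOATING, (0.0, 2.0, 2.0)),
]
```

An interface could then give focus to the first item returned by `by_priority(scene)` and blur the rest, in line with the focus-and-blur layer described above.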

Search: Search in the AW3 would involve anything from context-aware visual search to object search or even location search. The straightforward advantage of visual search is finding a location in the world based on an image. This could help in finding the next best game arena, or in identifying an unknown location used by a fugitive. Users will also start using “similar search” more actively. In similar search, unlike conventional search, the user might search using a portion of the AW3 world on-screen. The search engine would then have to intelligently identify elements, interactions, contexts, text, visual and audio data etc., and provide locations with similar entities in the AW3 as search results. Assuming a new kind of IP address based on GPS location, layer (public, private or group) and context, search could point directly to a particular location, e.g. “Chin Lee’s kitchen cabinet”. The presence of more objects and interactions would require content creators to tag them so that text-based search is easier. On the other hand, users would want to search using a gesture, a sketch or an object, besides text or voice. Such search would make more sense in a virtual world than in present desktop search. Displaying search results would again be an interesting challenge. Results could be grouped by type: objects (including 3D objects, images and videos), location previews, or interactions (gravity and physics present or absent, fluid motion etc.).
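An address built from GPS location, layer and context might look something like the sketch below. The `aw3://` scheme, the `AW3Address` class and the field layout are purely hypothetical, invented here to illustrate the idea, and the coordinates are arbitrary example values.

```python
from dataclasses import dataclass
from urllib.parse import quote, unquote


@dataclass(frozen=True)
class AW3Address:
    latitude: float
    longitude: float
    layer: str    # "public", "private" or "group"
    context: str  # free-text label, e.g. a tagged object or place

    def to_uri(self):
        """Serialize to a hypothetical aw3:// URI with a percent-encoded context."""
        coords = f"{self.latitude:.6f},{self.longitude:.6f}"
        return f"aw3://{self.layer}/{coords}/{quote(self.context, safe='')}"

    @classmethod
    def from_uri(cls, uri):
        """Parse a URI produced by to_uri back into an address."""
        layer, coords, context = uri[len("aw3://"):].split("/", 2)
        lat, lon = coords.split(",")
        return cls(float(lat), float(lon), layer, unquote(context))


# Arbitrary example coordinates, used only for illustration.
cabinet = AW3Address(10.0, 20.0, "private", "Chin Lee's kitchen cabinet")
```

A search result could then carry such an address, letting the client jump straight to the right place and layer, subject to whatever permissions apply to it.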

Conclusion

Project Tango brings a whole new spectrum of changes in how we perceive the digital world, and with it a large number of opportunities. Spaces will change according to large-scale usage patterns and will be seen converging on different niche areas. While privacy and security would become a concern, ownership would play an important role in how spaces, society and businesses adjust to the AW3. The AW3 would also be an important shift from regular websites and would prove to be a whole new area with high creative possibilities. Movement of people and social patterns could also change significantly. Project Tango could be the next big step in society’s shift towards a more virtual world. Once large-scale mapping and content creation take place, the physical world could easily become a playground of possibilities. Whether this would mean complete human detachment from the real world into the virtual, or a highly efficient support system for the present real world (re-emphasizing the real world), is something only time will tell.

References

[1] Project Tango, https://get.google.com/tango/

[2] Microsoft Kinect, https://en.wikipedia.org/wiki/Kinect

[3] Microsoft Hololens, https://www.microsoft.com/microsoft-hololens/en-us

[4] J. Rekimoto, Y. Ayatsuka, K. Hayashi, Augment-able reality: situated communication through physical and digital spaces, Wearable Computers, 1998.

[5] M. Hirakawa, K.P. Hewagamage, Situated Computing: A Paradigm for the Mobile User-Interaction with Multimedia Sources, Annals of Software Engineering 12, 213–239, 2001.

[6] N. Avouris, N. Yiannoutsou, A Review of Mobile Location-based Games for Learning across Physical and Virtual Spaces, Journal of Universal Computer Science, vol. 18, no. 15 (2012).

[7] Second Life, http://secondlife.com/

[8] Entropia Universe, http://www.entropiauniverse.com/

[9] inForm, http://tangible.media.mit.edu/project/inform/

[10] Reality Editor, http://www.realityeditor.org/

