Mixing Realities & Language Learning in the Wild

Seamlessly blend what you live with what you learn.

Published in

MIT MEDIA LAB

7 min read · Sep 28, 2017

Learning a second language is a journey that typically begins in the classroom. However, it’s not until we get out into the world that we validate our success (or failure) at learning by employing language to accomplish our goals. This transition, or rather transfer of knowledge, from a realm of conventional (curriculum-ready) interactions to an unpredictable reality comes with its own challenges. The language-learning classroom is (hopefully) an environment where we feel safe to make mistakes when we communicate with other learners. The real world, on the other hand, is not nearly as forgiving. When you’re up next in line to order a cup of coffee in a non-native language during rush hour, it’s easy to freeze up, make mistakes, and end up with the wrong drink! Expressing oneself becomes a challenge of finding the right words to transmit the right set of ideas in ever-changing situations. In a way, there is a disconnect between the context in which people learn and the way that knowledge is applied in real-life scenarios. (In fact, this dilemma is not exclusive to language learners; it affects learners in general.)

Many opportunities to learn a new language surround us every day. Be it in the music we hear in the streets, the TV shows we watch in the comfort of our homes, or the people we interact with at work, we come across tiny bits of knowledge that we could employ to our advantage in our quest to master a new language. Wouldn’t it be great if we could learn Spanish while we enjoy the latest episode of Game of Thrones?

Unfortunately, these ubiquitous learning experiences tend to pass us by. Sometimes we lack the minimum expertise (e.g. vocabulary) to grasp them. Other times they are so nuanced and swift that they are gone in the blink of an eye. In general, we don’t have the tools we need to leverage such experiences and turn them into instances of serendipitous learning. Luckily for us, mobile technology has become an ever-present agent that can mediate these fleeting moments, allowing us to take advantage of the learning experiences that happen around us and bridge the gap between where we learn language and where we employ our new skills.

Mixed reality (MR) offers a promising way to mediate these learning opportunities on the go by projecting information onto the learner’s physical reality. Think Tony Stark’s Iron Man suit and its ability to augment his environment with visual cues that allow him to swiftly address any situation. The core motivation behind MR as a medium for learning is simple and powerful: How can we extend the world around us with information and interactions that enable situated learning outside of a classroom?

*The team that worked on the WordSense platform for language learning in mixed reality using the Microsoft HoloLens: Takako Aikawa, Megan Fu, Afika Nyati, Christian Vázquez, and Alexander Luh. Credit: Megan Fu*

Imagine learning a new language while you move about a new city. You’re already set up with an MR headset that, unlike Google Glass or its predecessors, doesn’t quite make you look like a futuristic hipster. This device is aware of your surroundings and is powered by the latest in artificial intelligence. That means it knows the words and phrases you’ve encountered before, because you’ve worn it for the last couple of days in your exploration. Whenever you encounter something new, the heads-up display pops up a novel word in your target language. Now you have a choice to make: walk by it, or pause to interact with this new tidbit of knowledge. You can hear the word out loud, learn how to use it according to your current context, and perhaps connect it with prior knowledge, all with just a couple of gestures in the air. The next time you encounter a similar scenario, the MR device might cue you about it or remind you of the knowledge you’ve already accrued. It hears you interact with the locals, and it can give you feedback on the spot or suggest conversational cues. It lets you explore the richness of your experiences, weaving learning into your daily activities.
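To make the scenario above a little more concrete, here is a minimal sketch of the "new word versus reminder" logic it describes. This is not the WordSense implementation: the `VocabularyTracker` class, its `lexicon` mapping, and the English-to-Spanish entries are all hypothetical, and the object labels are assumed to come from some computer vision component that is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyTracker:
    """Hypothetical tracker of the target-language words a learner has met."""

    # Assumed mapping from a detected object label to a target-language word.
    lexicon: dict[str, str]
    # How many times each target-language word has been shown so far.
    seen_counts: dict[str, int] = field(default_factory=dict)

    def encounter(self, detected_labels: list[str]) -> list[tuple[str, str]]:
        """Return (word, status) cues for the objects currently in view.

        status is "new" the first time a word appears and "review" afterwards.
        Labels without a lexicon entry are skipped.
        """
        cues = []
        for label in detected_labels:
            word = self.lexicon.get(label)
            if word is None:
                continue
            count = self.seen_counts.get(word, 0)
            cues.append((word, "new" if count == 0 else "review"))
            self.seen_counts[word] = count + 1
        return cues


# Example: two consecutive scenes during a walk.
tracker = VocabularyTracker(lexicon={"cup": "taza", "chair": "silla"})
first_scene = tracker.encounter(["cup", "lamp"])
second_scene = tracker.encounter(["cup", "chair"])
```

In this sketch the first scene yields a "new" cue for "taza", while the second scene marks "taza" as "review" and introduces "silla". A real system would also need to decide when to stay quiet, which this example does not attempt.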

The aforementioned scenario is not too far into the future. In fact, we recently developed a prototype application on Microsoft HoloLens that allows this kind of interaction with your surroundings. WordSense leverages advances in computer vision to understand the user’s environment, to the extent that it can make actionable suggestions that are useful for learning. Beyond allowing for some really cool moments of “serendipitous” learning in the wild, we are using WordSense to explore how situated holographic content can create strong, meaningful associations that truly exploit the affordances of mixed reality as a tool for education.

*WordSense allows holographic glosses with contextually relevant information about newly encountered vocabulary words in the user’s environment. Credit: Christian Vázquez*

Within the context of language learning, a gloss is an annotation within a text that provides the definition of a certain vocabulary word. A gloss might also contain example usage or multimedia representations of the concept represented by the word. The effect of glosses on the retention of newly encountered vocabulary has been explored thoroughly. Studies have shown that embedding glosses or annotations in multimedia formats (such as images or sound clips) helps people learn new words within relevant contexts. Many smart devices today (e.g. Kindle) provide services that do this, to allow readers to expand their vocabularies as they enjoy the latest works of fiction. This positive effect on vocabulary learning can be framed within dual-coding theory. Dual-coding theory proposes that information is embedded in both verbal and nonverbal representations within our brain. These representations can be bridged by creating associations between the verbal component of a concept (e.g. the word “car”) and its visual analogue (e.g. an image of an old Volkswagen). In theory, the more associations you create between both representations, the more likely you are to remember the concept in the long run. Existing media already exploit the power of strong visual associations to help learners achieve their goals (think of children’s books, which are full of pictures).
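As a rough illustration of the gloss described above, the sketch below models one as a small record with a verbal side (definition and example usage) and a nonverbal side (images or sound clips). The `Gloss` class, its field names, and the file name are hypothetical and are not taken from WordSense; the `is_dual_coded` check is only a simple stand-in for the idea that a gloss pairs both kinds of representation.

```python
from dataclasses import dataclass, field


@dataclass
class Gloss:
    """Hypothetical annotation attached to a vocabulary word."""

    word: str
    definition: str
    # Verbal side: sentences showing the word in use.
    example_usages: list[str] = field(default_factory=list)
    # Nonverbal side: references to images or sound clips (assumed file names).
    media: list[str] = field(default_factory=list)

    def is_dual_coded(self) -> bool:
        """True when the gloss offers both verbal and nonverbal material."""
        has_verbal = bool(self.definition or self.example_usages)
        has_nonverbal = bool(self.media)
        return has_verbal and has_nonverbal


text_only = Gloss(word="coche", definition="car")
with_image = Gloss(word="coche", definition="car", media=["old_volkswagen.jpg"])
```

Here `text_only.is_dual_coded()` is false and `with_image.is_dual_coded()` is true, mirroring the difference between a plain dictionary entry and a gloss that also shows the object.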

*Picture taken from HoloLens that shows objects identified by the WordSense platform and labeled in the target language. Credit: Christian Vázquez*

This is an example where mixed reality can shine. When information is embedded in our reality with a visual synchrony that achieves a strong association between object and holographic overlay, how much more powerful does it make this gloss effect, compared to a simple image? An image is a two-dimensional representation of the object associated with the concept or word you are learning about. With mixed reality, you could think of the real-world object as a collection of hundreds of images (depending on how you look at the object). Does this afford a stronger level of association than other platforms? I think the most important questions around mixed reality as a platform for learning at this stage should follow a similar line: How is this different from anything I can do already with my phone?

For better or worse, mixed reality has become a buzzword in the tech scene. From researchers in academia to the venture capitalists of Silicon Valley, the words “mixed reality” get hearts pounding. And why not? The technology is fascinating (and nobody seems to forget that iconic scene in Minority Report, where Tom Cruise swiftly sorts through holograms left and right). This “hype” is not dangerous by itself; however, when people ask how MR is more than a “gimmick” for learners, the words “engagement” and “contextual” reverberate between empty phrases like, “It’s better than a phone because you don’t need to take it out of your pocket.” Instead of exploring the unique, meaningful pedagogical advantages of MR, too many otherwise thoughtful people are interested in mixed reality for the sake of novelty, rather than the sake of learning.

Another concern about MR is its potentially negative effect on the way we learn. A study published in Science found that having a smartphone can decrease our ability to recall information. What happens when the information is always there, ever-present, and seemingly evoked out of thin air for our convenience? Where is the line drawn between a tool for learning and a holographic cheatbook? How might we have to redefine our common understanding of what learning is or looks like?

It’s my personal belief that the difference lies in how we develop these technologies and think about integrating them with instances of more structured learning. The interplay between what happens in the world, where you are learning with the mixed reality device, and how the teacher engages with the serendipitous experience to make it meaningful, lends itself to a more dynamic class and more personalized experiences for each student. This interaction between mixed reality experiences and structured classes could allow teachers to focus discussions, understand student progress, and even correct the system when shortcomings in the technology surface.

One thing is clear, though. Mixed reality’s potential lies in its capability to seamlessly blend what you live with what you learn. It is up to us as pioneers in this area to make a conscientious effort to put learning objectives at the forefront of the design process, to think about mixed reality as something more than a shiny toy, and to focus on the unique affordances of this platform and its potential impact on learning.

Christian David Vázquez Machado is a research assistant in the Fluid Interfaces Group and an ML Learning Fellow.

This post originally appeared on the MIT Media Lab website.

Written by MIT Media Lab Learning Initiative