How Fung Fellows, Blue Goji and Microsoft Built a Virtual Tutor
The development of a Virtual Tutor for a Virtual Reality (VR) game originates from the Active Learning project of the Fung Fellowship for Wellness and Technology Innovations program. The Fung Fellowship is a unique entrepreneurship and innovations program that connects diverse undergraduate students — 1/3 engineering/computer science, 1/3 public health, and 1/3 other majors, including arts and architecture — with community and industry partners for a sustained health and technology experience. Because of its interdisciplinary nature and focus on real-world problems in health, the program attracts a diverse student profile including larger cohorts of women (64%) and first-generation college students (38%).
Blue Goji, a technology startup focusing on the gamification of health, is one of the supporters of the Fellowship program and the corporate sponsor of this Active Learning project. The goal of the project is to study how effectively the “Active Gaming” technology developed by Blue Goji can serve as an incentive that motivates elementary and middle school students in underserved East Bay communities to improve their classroom behavior and helps them develop better focus and concentration skills. The Active Gaming technology comprises a Microsoft Windows-based gaming environment that supports active 2D and VR games, along with custom stationary bikes modified by Blue Goji for young students.
The Virtual Tutor is a key component of this Active Learning project. Working with Blue Goji game developers and volunteers from Microsoft, including UC Berkeley Master of Engineering alumnus Chun Ming Chin (a Microsoft Search and AI program manager), the Fellowship students are creating a special Active Learning version of an existing Blue Goji VR game, GoWings Safari.
This version includes a Virtual Tutor in the form of a wise owl named Merlin. Flying with a “Wing Pack,” each student will tour an African safari from the air. Merlin will guide each student with fun facts about safari animals as they fly. Merlin will also help students practice their math skills as they conduct animal surveys. To make this unique experience truly immersive, the Fellowship team aims to adopt speech recognition technology for a more natural, voice-based interaction between Merlin and young students.
Selecting the Right Speech Recognition Tool
Under Chun Ming’s guidance, the team tested and experimented with a range of speech recognition technologies from Carnegie Mellon University (CMU), Google, Amazon, and Microsoft. After this initial testing and experimentation, the team chose to focus on Microsoft Bing’s speech recognition APIs.
Louis Huang, an EECS student taking the lead on the programming work, comments, “Compared to my typical class coding assignments that use open source tools and generic public data, this project really expands my knowledge and experience working with these commercial tools. More importantly, knowing how my work will be used to help kids from underserved communities gives me the ultimate motivation to overcome whatever challenges we run into. Together, we’ve overcome a lot of technical challenges!”
Nicholas Kao, an Applied Math student focused on the data analysis and collection work, confirms, “Bing’s cloud-based speech recognition tool has high recognition accuracy! However, the next challenge we’re facing is the 3-second delay in getting the result back from the cloud.”
Fortunately, Chun Ming was able to connect the students to another group of Microsoft researchers from its Boston-based Azure team at the New England Research & Development (NERD) Center. With access to NERD’s work on an Azure AI product known as the Data Science Virtual Machine notebook, the Fellowship students were able to achieve a virtual tutor speech recognition accuracy of 91.9%.
This is an improvement over the 85% accuracy of CMU’s open source speech recognition model, which served as the benchmark. Moreover, NERD’s and CMU’s models have the same average execution time: 0.5 seconds per input speech file.
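The article does not describe how the team measured these figures. As a rough illustration only, accuracy and average execution time per speech file could be computed along the following lines. The `benchmark` function, the `toy_recognizer` stand-in, and the clip names are all hypothetical, and the toy numbers are not the project's results.

```python
import time
from typing import Callable, Iterable, Tuple


def benchmark(recognize: Callable[[str], str],
              samples: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (accuracy, mean seconds per file) for a recognizer.

    `recognize` maps an audio file path to recognized text; `samples`
    pairs each file path with its expected transcript. Both are
    assumptions made for this sketch.
    """
    correct = 0
    total = 0
    elapsed = 0.0
    for path, expected in samples:
        start = time.perf_counter()
        predicted = recognize(path)
        elapsed += time.perf_counter() - start
        total += 1
        # Compare case-insensitively and ignore surrounding whitespace.
        if predicted.strip().lower() == expected.strip().lower():
            correct += 1
    if total == 0:
        raise ValueError("no samples to benchmark")
    return correct / total, elapsed / total


def toy_recognizer(path: str) -> str:
    """Hypothetical stand-in for a real speech recognition model."""
    answers = {"clip1.wav": "seven", "clip2.wav": "zebra", "clip3.wav": "lion"}
    return answers.get(path, "")


# Made-up evaluation set: the recognizer gets three of four clips right.
toy_samples = [
    ("clip1.wav", "seven"),
    ("clip2.wav", "zebra"),
    ("clip3.wav", "lion"),
    ("clip4.wav", "twelve"),
]

accuracy, mean_seconds = benchmark(toy_recognizer, toy_samples)
print(f"accuracy: {accuracy:.1%}, mean time per file: {mean_seconds:.6f} s")
```

Running two models through the same function on the same labeled clips is what makes the accuracy and timing figures directly comparable.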
An additional prototype Deep Learning model was developed by NERD based on a winning solution of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 challenge. This model can push classification accuracy up even further and scales to larger training datasets.
The Fellows’ final phase at the elementary school included collecting voice training data from the middle school students and calibrating the model. Max Kaznady, a senior data scientist from NERD, shared, “This collaboration is a convenient way to test incubating AI tools from Azure and customize such tools to deliver greater evangelical impact to underserved communities and companies like Blue Goji.”
The team plans to wrap up its Virtual Tutor development work by the end of the summer and deploy this Virtual Tutor-enabled version of GoWings Safari as a pilot VR game at a middle school in Fall 2018. The team hopes to observe how students interact with Merlin, the tutor, and fine-tune the game accordingly. Louis graduated from UC Berkeley in May 2018 and will pursue his Master of Engineering degree at the Fung Institute in Fall 2018. Nick will graduate in May 2019 and will continue to manage this project and support the middle school throughout the next academic year.
James Tayali, Blue Goji’s GoWings Safari producer, will continue to manage the project. As a Fung Fellowship alumnus with a degree in Public Health, James is particularly interested in this research project. Growing up in Malawi as an orphan, James can identify with kids from underserved communities. He comments, “I know this project can help kids who have experienced many life challenges at young ages. I can’t wait to see the results!”
This is a win-win project for Blue Goji, the students, the university, and Microsoft. Coleman Fung, CEO of Blue Goji, comments, “The students have done an excellent job researching this technology for us. At the same time, they are learning a lot on a real-world application for this emerging technology as well as having the actual experience working with their customers — the kids from these underserved communities.” He continues, “After the pilot testing during the next school year, we’ll make a final decision on whether we will adopt this technology or not. But I am very cautiously optimistic.”