Voice Enabling Your React App — Add a Voice Assistant With Alan AI’s Platform

Roanak Baviskar
Published in The Startup
10 min read · Jul 24, 2020

Alan is a complete Conversational Voice AI Platform that lets you Build, Debug, Integrate, and Iterate on a voice assistant for your application.

Previously, you would’ve had to build everything from the ground up: learning Python, creating your own machine learning model, hosting it in the cloud, training speech recognition software, and tediously integrating it all into your app.

Alan Platform Diagram

The Alan Platform automates this with its cloud-based infrastructure, which incorporates advanced voice recognition and Spoken Language Understanding technologies. This lets Alan support complete conversational voice experiences, defined by developers in Alan Studio scripts written in JavaScript, and Alan integrates voice AI into any application with easy-to-use SDKs.

To show you the power of the Alan Platform, we’ll start with a React Example App and add a few simple features to create a multi-modal interface. The React application we’ll be working with is a Hacker News Clone, available on the React Example Projects page.

You can also refer to this repository for a completed result. If you are unfamiliar with JavaScript or React, you can clone this completed repository instead of the one below to follow along. The completed Alan Script is inside, called Hacker-News-Example-Alan-Script.zip.

To start, clone the example project and make sure it works correctly:

$ git clone https://github.com/clintonwoo/hackernews-react-graphql.git
$ cd hackernews-react-graphql
$ npm install
$ npm start

Now that we have our app saved on our computer, we can start with the voice script. Remember where this is saved — we’ll need to come back for it later!

Building Your Alan Application

First, sign up and create an Alan Studio account:

Next, log in to Alan and you’ll see the project dashboard. Here, we’ll click Create Voice Assistant and select the Hello World Example:

Choosing this template includes the SmallTalk script, meaning our voice assistant can answer simple questions out of the box. Delete the Hello_World file and create a new file called Hacker_News_Example; you can give the project a similar name.

To understand the Alan voice scripts, there are two essential features we need to know — intents and entities.

  1. Intents: the phrases we want to recognize, such as “What can I do?” or “What is available?”
  2. Entities: the keywords within those intents. Product names or category aliases, for example, are the specific words that matter to how the app functions.

In these scripts, Alan supports advanced language definition tools that can be used to make intents and entities of any complexity. Entities like lists loaded from databases or fuzzy entities are critical in many different use cases and can be handled by Alan’s advanced dialog management system.
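As a toy illustration (the category names here are made up and not part of the Hacker News app), an intent with a slot entity could look like this in an Alan script:

// Illustrative only: the $(NAME value1|value2) syntax defines a slot entity,
// and the matched value is available on the p object inside the handler.
intent('(Show|Find) me the $(CATEGORY phones|laptops|tablets) category', p => {
    p.play(`Opening the ${p.CATEGORY.value} category for you.`);
});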

Alan also supports UI navigation with voice commands — enabling users to navigate through the different screens in an application and creating a seamless user experience.

More information about Alan Scripts can be found in the Alan Docs Server API Section. I’d recommend taking a glance before continuing.

Now, let’s get started on our script. We can begin with navigation for some of the tabs at the top of the Hacker News page:

We can consider a few ways users might try to navigate to the New page and use Patterns to define our intent. We also have to notify our application of the user’s request to change the page, which we do by sending the application a command.
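A first pass at that intent could look something like the following in the Alan script (the command name newPage is my own choice here; the client code later just needs to handle the same string):

// Cover a few natural phrasings for opening the "new" tab
intent('(Take me to|Go to|Open|Bring up) the (new|newest) page', p => {
    // Tell the client application to change the route
    p.play({command: 'newPage'});
    p.play('Opening the newest stories.');
});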

Let’s get this working with our application before adding more voice scripts. First, navigate to the directory where you cloned the Hacker News repository and open it in your preferred environment.

Alan’s Client API documentation can be found here. To connect the Alan Platform to this app, we need the Alan Web SDK:

$ npm install @alan-ai/alan-sdk-web --save

We will be making a majority of our changes within the following file, since it has access to all relevant data and components:

/hackernews-react-graphql/src/components/news-feed.tsx

When we initialize the Alan Button in the app, we will need to do it within an Effect Hook, to ensure the page has rendered. We can use the React.useEffect() function within the functional component NewsFeedView to achieve this. We will also need to import APP_URI to route the user to the new page. Add the following code to your news-feed.tsx file, placing the useEffect function inside the NewsFeedView component:
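A minimal sketch of that initialization is below. The SDK key is a placeholder, the newPage command matches the script sketched earlier, and the import path for APP_URI is an assumption about where the app defines it:

import alanBtn from '@alan-ai/alan-sdk-web';
import { APP_URI } from '../config'; // adjust to wherever the app actually exports APP_URI

// Inside the NewsFeedView functional component:
React.useEffect(() => {
  alanBtn({
    key: 'YOUR_ALAN_SDK_KEY',
    rootEl: document.getElementById('alan-btn') as HTMLElement,
    onCommand: (commandData: any) => {
      if (commandData.command === 'newPage') {
        // Route the user to the "newest" page
        window.location.href = `${APP_URI}/newest`;
      }
    },
  });
}, []);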

Your Alan SDK Key is available by clicking the Embed Code button in your Alan Console.

We have set the Alan button to appear at an element with ID “alan-btn”. We can add the following line of code in the active.tsx file:

<div id="alan-btn" />

Using Slots, we can extend the new page command to the following intents: Home page, Show HN page, Next page, Comments page, Ask HN page, and HN jobs. Your active.tsx, Alan Script, and news-feed.tsx files could look like the following:
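In the Alan script, a single intent with a slot can cover all of those pages (the page aliases and the command payload below are assumptions; match them to the routes your app actually exposes):

intent('(Take me to|Go to|Open|Bring up) the $(PAGE home|new|show hn|ask hn|hn jobs|comments|next) page', p => {
    // Send the spoken page name to the client, which maps it to a route
    p.play({command: 'newPage', page: p.PAGE.value});
    p.play(`Opening the ${p.PAGE.value} page.`);
});

On the client side, the onCommand handler then maps commandData.page to the matching route (for example, new to /newest) before updating window.location.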

We should also allow the user to ask Alan what the news is today. Though there are many ways we could structure the voice experience, a simple format would be Alan reading 3 headlines and asking the user whether they want to continue or read an article.

We can set this up by initially sending the Alan server the list of headlines to be read by using the Visual State. We insert the first piece of code within the React.useEffect() function, after defining the Alan button. It would also be helpful to allow the user to open any news article by the number. We can account for page number with the logic in the next piece of code:
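One way to do both is sketched below, assuming we now keep the instance returned by the alanBtn() call from the previous step, that newsItems stands in for however this component already receives its articles, and that the page number lives in a ?p= query parameter as on Hacker News:

// Inside the same React.useEffect(): capture the instance returned by alanBtn()
// (options trimmed here; keep the ones from the previous snippet).
const alanInstance = alanBtn({
  key: 'YOUR_ALAN_SDK_KEY',
  rootEl: document.getElementById('alan-btn') as HTMLElement,
});

// Derive the current page from the URL, defaulting to 1, so that
// "open article five" can be resolved relative to what is on screen.
const page = Number(new URLSearchParams(window.location.search).get('p')) || 1;

// Share the visible headlines and the page number with the Alan script
alanInstance.setVisualState({
  headlines: newsItems.map((item) => item.title),
  page,
});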

We can then access this data within our Alan Scripts. We can set up the 3 headline reading logic using a Context, and add the open article intent as well:
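A sketch of that script logic follows. The command names (such as openArticle) and the exact accessors for the visual state, dialog state, and the predefined NUMBER slot are assumptions drawn from the Alan docs, so double-check them against the Server API reference:

// Read the next three headlines from the visual state sent by the app
function readBatch(p) {
    const headlines = (p.visual && p.visual.headlines) || [];
    const batch = headlines.slice(p.state.index, p.state.index + 3);
    batch.forEach(title => p.play(title));
    p.play('Shall I continue, or would you like me to open an article?');
}

// Context activated after the user asks for the news: they can either
// keep hearing headlines or open one of the articles by number.
const readNews = context(() => {
    follow('(Continue|Next|Keep going)', p => {
        p.state.index += 3;
        readBatch(p);
    });
    follow('Open (article|number) $(NUMBER)', p => {
        p.play({command: 'openArticle', number: p.NUMBER.number});
    });
});

intent('(What is the news today|Read me the news|Tell me the news)', p => {
    p.state.index = 0;
    readBatch(p);
    p.then(readNews);
});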

We’ve now added a simple voice interface to the app. The next step is to add some visual feedback to the voice commands, such as highlighting articles as Alan reads them. Let’s use a Material UI Box to highlight the articles. To get started with this component, install the Material UI package with the first command in your terminal and add the import on the next line to the top of your news-feed.tsx:

$ npm install @material-ui/core

import Box from '@material-ui/core/Box';

To make the Box appear on a state change, we can add the Box to every news element and edit the style to make it appear. Since we have all the headline titles in our Alan script, we can send the title of the article we’d like to highlight. In our app, we will have to set up a title state like so:

const [title, setTitle] = React.useState('Placeholder');

We can then reference this variable when setting the style of our Box. To insert this logic in news-feed.tsx, we have to change the format of the .flatMap call that renders the news items. Specifically, by writing out the arrow function in full, we can compare each news item’s title to our title state and edit the style of the Box. We add the Box component around the NewsTitle and NewsDetail components and edit its style using a defaultProps variable.

After these changes, our .flatMap call should look similar to this:
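A sketch of what that could look like is below; the newsItems name is an assumption about how the component receives its data, the props spread onto NewsTitle and NewsDetail should stay whatever the original file already passes, and inline Box props are used here in place of a defaultProps object for brevity:

{newsItems
  .filter((newsItem) => newsItem.title && newsItem.url)
  .flatMap((newsItem) => {
    // Highlight the article Alan is currently reading by comparing titles
    const isHighlighted = newsItem.title === title;
    return [
      <Box key={newsItem.id} border={isHighlighted ? 2 : 0} borderColor="primary.main">
        <NewsTitle {...newsItem} />
        <NewsDetail {...newsItem} />
      </Box>,
    ];
  })}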

Now, we just need to set the state whenever Alan reads an article title. We can add the following command to our Alan script, placed wherever a headline is played:

p.play({"command": "highlightTitle", "title": headlines[i]});

We can then add the following check to our Alan button’s onCommand handler to create the highlighting animation.

if (commandData.command === "highlightTitle") {
  setTitle(commandData.title);
}

We now have the beginnings of a multi-modal experience set up!

Versioning

Alan supports versioning for development, testing, and production — helping you easily manage the process of adding the voice experience to your application. Publishing a new version is automated in Alan’s back-end and will automatically link to all production devices, without requiring any manual deployment.

Our script is currently saved under Development as the “Last” version (the only editable one). After debugging, we’ll save our voice script and move it to Production. Let’s name this version “V1” and select “Run on Production”.

Take a look at this demo of how far we’ve come:

Just the simple addition of highlighting text as Alan speaks has completely transformed the UX, engaging your user in your app with little effort.

Now, we can move to debugging our voice experience in the application.

Debugging

Alan provides a host of features that makes debugging easy and efficient.

For example, a user may not know how to use voice commands, and might ask something such as “What do you do?”. Try asking this to your app.

In Alan Studio, you should see that this was logged as an unrecognized input. First, navigate back to the “Last” version so that you can make changes to the Alan script.

Let’s add a question to explain to our users how to use the voice interface.

const howDoesThisWork = [
    'How (do|does) (you|this) work?',
    'What (do|does) (you|this) do?',
];

question(howDoesThisWork, p => {
    p.play('You can tell me to read the news, or ask me what the news is today. To navigate pages, say Take me to, Bring up, or Go to whichever page you would like.');
});

Even as your app gains more users, you can make sure they have a good experience by updating the script as necessary. Let’s push this new version to production and name it “V2”. Make sure you click the “Run on Production” button.

If you try your app, it will now recognize the questions we just added.

Most other voice-enabling tools don’t offer this kind of feedback loop. With Alan, you can gauge user feedback and update the intents in your script to make your app more responsive and intelligent.

To take full advantage of this iteration loop, let’s finish by building some automated test cases for our scripts. Click the “Test” button in the top left corner.

Let’s add some automated tests. We can add test cases for the “Read the News” command with the following visual state (make sure your app is running when you run the test cases):

{
"headlines": [
"When Hubble Stared at Nothing for 100 Hours (2015)",
"I’m Peter Roberts, immigration attorney who does work for YC and startups. AMA",
"To Get More Replies, Say Less (2017)",
"Indian IT consultancies struggle against technological obsolescence",
"How ftrace was able to brick e1000e network cards (2008)",
"FAA issues emergency directive on 2,000 Boeing 737 NG Classic planes",
"Unofficial Guide to Datomic Internals",
"Amazon Warehouse scam: 16TB HDD swapped for 8TB, returned for full refund",
"Reporters are leaving newsrooms for newsletters",
"Emulating Nintendo Switch Games on Linux",
"Show HN: How I made simple Geolocation service which handles 6m+ req/mo for $5",
"Cold Showers: For when people get too hyped up about things",
"We’re treating self-improvement like a software upgrade",
"Editorial board of Index and more than 70 staff members resign",
"NIST’s Post-Quantum Cryptography Program Enters ‘Selection Round’",
"The Covid-19 pandemic is forcing a rethink in macroeconomics",
"Show HN: Minimal, easy, and affordable, session Replays and Heatmaps",
"Apple has started making the iPhone 11 at the Foxconn plant near Chennai",
"Shadow attacks: hiding and replacing content in signed PDFs",
"Mail for Good",
"The Four Quadrants of Conformism",
"The Eastland Disaster Killed More Passengers Than the Titanic and the Lusitania",
"Best Data Science Books According to the Experts",
"Erythrocyte omega-3 index, ambient fine particle exposure and brain aging",
"The Cincinnati Privy Disaster of 1904",
"Ask HN: Is it just me? why is “news” so addictive?",
"Ludic Fallacy",
"Nvidia will build 700-petaflop supercomputer for University of Florida",
"Paul Graham Essay Notes",
"How to validate your startup idea quickly"
]
}

We can similarly add test cases for all of our app’s functionality. By pressing the Play button, we can run all of these tests at once and confirm that our script is working correctly.

Now that we’ve completely engaged with the Alan platform, let’s go over everything that we’ve learned.

Conclusion

Only the Alan Platform gives you the ability to create a voice assistant that enhances your application’s existing user experience and make continuous improvements that precisely align with what your users want.

With its simplified format, Alan is accessible to any developer, and does the heavy lifting of creating an accurate language model and managing the dialogues so that you can Build, Debug, and Integrate a voice assistant into your app in just a few days.

Building with Alan is simple — the voice scripts are intuitive, scalable, and powerful. After developing your voice script, you can debug your scripts and take full control of your development-deployment stack. Then, you can integrate Alan into your application without making any changes to your existing workflow or UI. Finally, you can develop automated testing for future scripts and efficient deployment.

And to top it all off, you can create an innovative, multi-modal user experience with your favorite front-end JavaScript library, one that engages your users like nothing else and takes surprisingly little effort.

With Alan, make your applications hands-free and bring your users the best conversational voice experience.

For reference, view sample Alan projects and SDKs here: https://github.com/alan-ai

Or look at the Alan Documentation for additional projects.

Refer to the following post for more information about Alan: https://medium.com/@alanvoiceai/voice-enabling-your-app-complete-guide-of-how-you-can-add-a-voice-assistant-to-your-existing-46ed72d972df
