How to Build a Virtual Reality Chatbot using AWS Sumerian | AWS

Published in Vaibhav Malpani’s Blog, Jul 29, 2019 (6 min read)

Disclaimer: This post explains only the components used for building a VR chatbot.

In my previous blog, I spoke about the importance of having a VR chatbot over a traditional chatbot. Here I’ll explain how you would start building one.

Steps:

AWS Configuration
Start a Scene
Import a host
Speech Component
State Machine
Dialogue Component

1. AWS Configuration

AWS Sumerian requires a lot of roles and IAM configuration, and it’s quite tough to get it all right quickly (I wasted 2 days on it). To make things easier, you can use this link, which does all the configuration required for AWS Sumerian.

In the image above, click on “Launch Stack” beside “Using Host, Speech and Dialogue Components”.

After the configuration is complete, go to the “Outputs” tab, where you will find the “Cognito Identity Pool ID”. Copy it somewhere; you will need it in the coming steps.
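If you would rather read the pool ID from a script than copy it from the console, here is a minimal Python sketch. It only assumes the standard shape of a CloudFormation `describe_stacks` response (a stack with an `Outputs` list of `OutputKey`/`OutputValue` pairs). The stack name, the output key and the ID value below are made up for illustration, so check them against your own stack.

```python
from typing import Optional


def find_stack_output(stack: dict, key_fragment: str) -> Optional[str]:
    """Return the first output value whose key contains key_fragment (case-insensitive)."""
    for output in stack.get("Outputs", []):
        if key_fragment.lower() in output.get("OutputKey", "").lower():
            return output.get("OutputValue")
    return None


# Hypothetical data, shaped like one entry of describe_stacks()["Stacks"].
example_stack = {
    "StackName": "SumerianTutorialStack",  # assumed name, not from the template
    "Outputs": [
        {
            "OutputKey": "CognitoIdentityPoolID",  # assumed key, verify in your stack
            "OutputValue": "us-east-1:00000000-0000-0000-0000-000000000000",
        }
    ],
}

pool_id = find_stack_output(example_stack, "IdentityPool")
print(pool_id)
```

With boto3 installed and credentials configured, the `example_stack` dictionary could be replaced by the first entry of `boto3.client("cloudformation").describe_stacks(StackName=...)["Stacks"]`.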

2. Start a Scene

Go to the AWS console, search for the “Amazon Sumerian” service and open it.

Click on “Create new scene”, give your scene a name, click “Create” and wait for it to load.

Once it loads, you will see the screen below.

On the top right-hand side you will see “AWS Configuration”. Click on it to expand it and enter the “Cognito Identity Pool ID” in the text box provided.

That’s it! All the configuration for our chatbot is done now. Let’s get to the exciting part.

3. Import a host

To create a VR chatbot, let’s start by getting a host for it. The host is essentially the face of your chatbot.

Click on “Import Assets” at the top.

You can scroll down to see a lot of “Host Assets”. Select any host from the options offered and click on “Add”.

You will see it in the lower left corner.

Now, to get the host into your scene, expand the imported host asset (in this case ‘Wes TShirt’), then select the host entity (the one in red in the image) and drag it to the middle of your scene.

Sometimes the host is too small to even get noticed. Try zooming in to the scene by scrolling up with your mouse.

Your scene is now ready with your host in the middle.

4. Speech Component

Let’s now give a voice to the host.

Click on the host, and on the right-hand side you will see all the configuration that you can add to it.

Click on “Add Component” and select “Speech” if it’s not already present. You can now choose from a wide range of voices depending on the gender of your host and the locale in which you plan to use the chatbot.

Click on the “+” sign to add a speech. A new window will pop up. Here you can write any speech that you want and save it. The newly created speech will be shown in the bottom left corner. Select it and drag it to the speech setting. You can click play beside the speech object to hear the selected voice, and try changing it till you find a perfect match.

Just below that you will find a “Gesture Map”. Click on it to expand it and then click “+”. It will add a new “DefaultGestureMap”. You can open it from the bottom left corner and change it if you want to. For most cases it works quite fine as it is.

Now, to add gestures to our speech, select the button highlighted in the image.

After that, click on edit speech to see the speech file. You will notice that a few new lines have been added to your speech file.
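To get a feel for what those added lines do, here is a small Python sketch (not Sumerian code) of the idea behind a gesture map: words in the speech are looked up in a word-to-gesture table, and a mark is placed in front of every match so the host knows when to play the gesture. The `<mark name="gesture:..."/>` format and the gesture names are assumptions for illustration; compare them with what the editor actually generated in your speech file.

```python
import re


def add_gesture_marks(text: str, gesture_map: dict) -> str:
    """Insert an SSML-style <mark> tag before each word that has a gesture mapped to it."""

    def mark(match: "re.Match") -> str:
        word = match.group(0)
        gesture = gesture_map.get(word.lower())
        if gesture is None:
            return word  # no gesture for this word, leave it untouched
        return f'<mark name="gesture:{gesture}"/>{word}'

    # Visit every word once; punctuation and spaces are left as they are.
    return re.sub(r"[A-Za-z']+", mark, text)


# Hypothetical gesture map: word -> gesture name.
gesture_map = {"hello": "wave", "you": "you"}

marked = add_gesture_marks("Hello, how are you?", gesture_map)
print(marked)
```

Words that are not in the map pass through unchanged, which is why a short default map already works for most speeches.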

5. State Machine

A state machine is essentially a way of telling Sumerian entities how to perform one action after another. It consists of states, and actions are added to each state. Using the state machine graph, we can create a path for our entity.

To start making the state machine for the host, click on the host, select “Add Component” on the right-hand side and then select “State Machine”. Click on “+” to start building your behaviour.

You will now see a “State Machine Graph”. You can rename “State 1” to whatever you like. On the right side of the graph, click on “Add Action”. In the window that pops up, type “AWS SDK Ready” in the search bar and select it. This action checks whether all the AWS configuration is ready to use.

At the top of the State Machine Graph, select “Add State”. Click on the new state, select “Add Action”, search for “Start Speech” and add it. After adding the action, select which speech to run on the right side. Finally, connect the two states by selecting “On AWS SDK Ready” and dragging the arrow to the “Start Speech” state. At the end of all this, your State Machine Graph should look like the one below.

Now we are done with the speech setup for the host. Try clicking on “Play the Scene” in the middle at the bottom and voila!! The host starts speaking. Notice that it also makes gestures based on the words in your speech.
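As a rough mental model of this two-state graph, here is a minimal Python sketch (again, not Sumerian code). Each state runs one action and then emits an event, and a transition table says which state follows which event. The state names and the “On Speech End” event are my own placeholders; only “AWS SDK Ready”, “Start Speech” and “On AWS SDK Ready” come from the editor.

```python
from typing import Callable, Dict, List, Optional, Tuple

# state name -> (action to run, event emitted when the action finishes)
States = Dict[str, Tuple[Callable[[], str], str]]
# (state name, event) -> next state name
Transitions = Dict[Tuple[str, str], str]


def run_state_machine(states: States, transitions: Transitions, start: str) -> List[str]:
    """Run actions state by state until no transition matches the emitted event."""
    log: List[str] = []
    current: Optional[str] = start
    while current is not None:
        action, event = states[current]
        log.append(action())
        current = transitions.get((current, event))
    return log


states: States = {
    "Check AWS": (lambda: "AWS SDK Ready", "On AWS SDK Ready"),
    "Speak": (lambda: "Start Speech", "On Speech End"),  # event name is hypothetical
}
transitions: Transitions = {
    ("Check AWS", "On AWS SDK Ready"): "Speak",  # the arrow dragged between the states
}

log = run_state_machine(states, transitions, "Check AWS")
print(log)
```

The speech only starts once the first state has reported that the SDK is ready, which is exactly what the arrow in the graph expresses.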

6. Dialogue Component

To start off, you will have to build an Amazon Lex bot. This bot should include the conversation that you want in your VR chatbot. After you have built and tested your Lex bot, copy and save its name.

Click on the host and on the right hand side select “Add Component” and then select “Dialogue”. Paste the name of your bot in Bot Name and below that put “$LATEST” for Bot Alias to always get the latest version of your Lex bot.
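To make the role of the bot name and alias more concrete, here is a Python sketch of the kind of request that goes to Lex when audio is sent to a bot. It assumes the Lex (V1) runtime `post_content` call from boto3; I am not claiming this is what Sumerian does internally. The bot name, user ID and audio bytes are placeholders.

```python
def build_post_content_request(
    bot_name: str, audio_bytes: bytes, user_id: str, bot_alias: str = "$LATEST"
) -> dict:
    """Build keyword arguments for the Lex (V1) runtime post_content call."""
    return {
        "botName": bot_name,
        "botAlias": bot_alias,  # "$LATEST" points at the latest version of the bot
        "userId": user_id,
        "contentType": "audio/l16; rate=16000; channels=1",  # 16 kHz mono PCM audio
        "accept": "text/plain; charset=utf-8",  # ask for a text reply
        "inputStream": audio_bytes,
    }


# Placeholder values: use your own bot name and real microphone audio.
request = build_post_content_request("MyVrChatbot", b"\x00\x00" * 160, "demo-user")
print(request["botName"], request["botAlias"])

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("lex-runtime").post_content(**request)
#   print(response.get("message"))
```

Testing your bot with a request like this first makes it easier to tell Lex problems apart from scene problems later.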

The behavior for a conversation is different from that for a plain speech, so to start conversing with your host we will have to build a new State Machine Graph.

Actions used in each state of the State Machine Graph:

AWS SDK Ready
Intro speech: a speech that introduces the bot (optional)
Key Down: select a key on the right-hand side
Start Microphone Recording, Key Up: select the same key that you selected in the previous step
Stop Microphone Recording
Send Audio Input to Dialogue Bot
Start Speech: check “Use Lex Response” on the right-hand side

Join these states as mentioned below: