Published in Voice Tech Podcast

Using Google Assistant as a teacher

Google Assistant can become an excellent teacher: design a custom flashcard Action to practice your Chinese (or any other language you like!)

Source: freepik

Every student trying to learn a language probably knows flashcards, which help them study and remember words.

Studying cards alone can be boring, so Google Assistant can become a fabulous teacher, able to select a random flashcard and ask for its translation (possibly giving suggestions, too!).

Although the Actions on Google Console makes it very easy to build a small card game through a premade “fill in the Google Sheet” process, sometimes it’s just fun to implement a custom Action to offer a richer user experience.

Moreover, a custom implementation can solve some problems caused by the peculiar nature of the language we want to learn. Chinese is a tonal language: an Action can’t simply handle English and Chinese together while preserving the Chinese tones, so we have to think about a different strategy to build the cards.

What is also nice is that the developer does not necessarily need to set up a backend, because Firebase Cloud Functions can be used to define the logic behind every intent!

Designing the conversation

Every Assistant project should start with a plan of the conversation, so that the developer can handle all the different scenarios that may come up.

This simple Action will first greet the user and then give the user a Chinese word to translate when the user asks for one.

The user will then give an answer: if it is correct, the Action will cheer; otherwise the Action will give the correct answer and maybe a sample usage of the word.

Let’s put all those conversation flows in a schema:
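The same flow can also be written down as a small transition table. This is only a sketch: the state and event names below are illustrative assumptions, not identifiers taken from the Dialogflow agent.

```javascript
// Hypothetical conversation states and the user events that move between them.
const flow = {
  welcome: { askWord: 'wordGiven', goodbye: 'closed' },
  wordGiven: { answer: 'answerChecked', goodbye: 'closed' },
  answerChecked: { more: 'wordGiven', goodbye: 'closed' },
  closed: {},
};

// Return the next state, or stay in the current one (a fallback) when the
// event is not expected in that state.
function nextState(state, event) {
  const transitions = flow[state] || {};
  const known = Object.prototype.hasOwnProperty.call(transitions, event);
  return known ? transitions[event] : state;
}

// Walk through one possible session.
const session = ['askWord', 'answer', 'more', 'answer', 'goodbye'];
let state = 'welcome';
for (const event of session) {
  state = nextState(state, event);
}
console.log(state);
```

Writing the plan down this way makes it easy to spot which user inputs each step has to handle, and which ones should fall through to a fallback.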

Developing the Action

First of all, we have to collect audio files with the pronunciation of each word, so that we can later use them in the fulfillment of the Action to actually say the word on the card (we can host those files on Google Cloud Storage, on GitHub or anywhere else, as long as they are publicly accessible).
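One way to use such recordings later is to wrap each file in an SSML `<audio>` element, so the Assistant plays the recording instead of synthesizing the word. The sketch below assumes a made-up base URL and file name; only the general `<speak>`/`<audio>` markup is standard SSML.

```javascript
// Hypothetical base URL: replace it with wherever the audio files are hosted.
const AUDIO_BASE_URL = 'https://example.com/hsk1-audio';

// Escape characters that are special in XML so they cannot break the markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an SSML snippet that plays the recording for a word.
// The text inside <audio> acts as a fallback if the file cannot be played.
function wordToSsml(fileName, fallbackText) {
  const src = `${AUDIO_BASE_URL}/${encodeURIComponent(fileName)}`;
  return `<speak><audio src="${escapeXml(src)}">${escapeXml(fallbackText)}</audio></speak>`;
}

console.log(wordToSsml('ni3hao3.mp3', 'ni hao'));
```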

Once this task is done, we can begin by creating a new Dialogflow project (you can create one by clicking the gear icon under the “DialogFlow” icon on the left):

When you click the “Save” button, the platform will create all the resources needed to run the project.

Intents and fulfillment are the two main concepts to understand for this project: an intent is triggered when the user says something and the Action recognizes what to do with that request; the fulfillment holds the logic behind the intents.

Exploring the environment, you will notice two default intents:

The “Default Welcome Intent” is the first intent to be called when the Action is invoked: it just welcomes the user with a greeting.

The “Default Fallback Intent” is called when the Action can’t recognize what to do.

To begin, click on “Fulfillment” in the left menu and enable “Inline editor”:

Clear the content of the editor and add some Actions on Google (AoG) code to enable use of the actions-on-google library:


We are now going to fulfill the Default Welcome Intent, because we want the Action to greet the user with some music and to add a suggestion chip that tells the user how to proceed:
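A handler along these lines might look like the sketch below. It assumes the conversation object exposes an `ask()` method, as the `conv` object of the actions-on-google library does, and it receives a `makeChips` function so the sketch can run outside Dialogflow; with the real library that function could be `(label) => new Suggestions(label)`. The jingle URL, the greeting and the chip label are made up for illustration.

```javascript
// Hypothetical jingle URL and chip label, used only for illustration.
const JINGLE_URL = 'https://example.com/audio/intro.mp3';
const NEXT_STEP_CHIP = 'Give me a word';

// Build the welcome handler. `makeChips` turns a label into a suggestion chip.
function makeWelcome(makeChips) {
  return function welcome(conv) {
    // Play the jingle, then greet the user in the same spoken response.
    conv.ask(
      `<speak><audio src="${JINGLE_URL}">music</audio>` +
      'Welcome! Ask me for a word when you are ready.</speak>'
    );
    // Show the user how to proceed.
    conv.ask(makeChips(NEXT_STEP_CHIP));
  };
}

// Stand-in conversation object that records what would be sent to the user.
const fakeConv = {
  sent: [],
  ask(item) { this.sent.push(item); },
};

makeWelcome((label) => ({ chip: label }))(fakeConv);
console.log(fakeConv.sent);
```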

Now let’s create a new intent and call it “takeWord”. The “takeWord” intent will randomly select a word from the HSK1 word list and show it to the user.

While creating the intent, we will also define a set of sentences which will trigger it:

Make sure to define enough training phrases, otherwise the machine learning model can’t be trained well enough to trigger the intent (and you will hear the fallback one instead!).

Let’s implement the fulfillment for the takeWord intent.

To keep this example simple, we will store all the words with their translations as an array of objects like this:
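As a hedged illustration, such a list could look something like the following. The field names, the `|` separator between meanings and the audio URLs are assumptions made for this sketch, not the original data.

```javascript
// Hypothetical separator used when a word has more than one accepted meaning.
const MEANING_SEPARATOR = '|';

// Hypothetical word list: one object per flashcard.
const words = [
  {
    hanzi: '谢谢',
    pinyin: 'xièxie',
    meanings: 'thanks|thank you',
    audio: 'https://example.com/hsk1-audio/xiexie.mp3',
  },
  {
    hanzi: '水',
    pinyin: 'shuǐ',
    meanings: 'water',
    audio: 'https://example.com/hsk1-audio/shui.mp3',
  },
  {
    hanzi: '再见',
    pinyin: 'zàijiàn',
    meanings: 'goodbye|bye|see you',
    audio: 'https://example.com/hsk1-audio/zaijian.mp3',
  },
];

// Every entry should carry the fields the handlers rely on.
const complete = words.every((w) => w.hanzi && w.pinyin && w.meanings && w.audio);
console.log(complete, words.length);
```

Keeping the meanings in a single separated string keeps each entry compact; an array of strings would work just as well.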

Then we add the intent to the intentMap:

intentMap.set('takeWord', takeWord);

and then define the takeWord() function:

The Action remembers the selected word because we add it to the context, so that when the user answers we can compare the answer to the previously selected word.
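A minimal sketch of this idea follows. It assumes the `agent` object offers `add()` and `context.set()`, as the WebhookClient of the dialogflow-fulfillment library does; the context name, lifespan, sample words and prompt text are all assumptions, and a small stand-in agent is included so the sketch runs on its own.

```javascript
// Hypothetical context name and a tiny sample word list.
const WORD_CONTEXT = 'selected-word';
const words = [
  { hanzi: '谢谢', meanings: 'thanks|thank you' },
  { hanzi: '水', meanings: 'water' },
  { hanzi: '再见', meanings: 'goodbye|bye' },
];

// Pick a random element; `random` is injectable so the choice can be tested.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

// Handler sketch: choose a word, remember it in a context, and show it.
function takeWord(agent, random = Math.random) {
  const word = pickRandom(words, random);
  agent.context.set({ name: WORD_CONTEXT, lifespan: 2, parameters: { word } });
  agent.add(`How do you translate ${word.hanzi}?`);
}

// Stand-in agent that records replies and contexts instead of sending them.
function makeFakeAgent() {
  const stored = {};
  return {
    replies: [],
    add(message) { this.replies.push(message); },
    context: {
      set(ctx) { stored[ctx.name] = ctx; },
      get(name) { return stored[name]; },
    },
  };
}

const agent = makeFakeAgent();
takeWord(agent, () => 0); // a fixed "random" value selects the first word
console.log(agent.replies[0], agent.context.get(WORD_CONTEXT).parameters.word);
```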

Let’s then define the “takeWord - answer” intent as a follow-up intent of “takeWord”.

“takeWord - answer” will be trained to recognize the student’s answers:

Let’s proceed by implementing the fulfillment of the “takeWord - answer” intent.

Again we add the intent to the intentMap:

intentMap.set('takeWord - answer', takeWordAnswer);

and implement the function body:

The function extracts the selected word from the context and compares it to the student’s answer, taking care to split multiple meanings on a predefined separator.

The Action reports whether the answer is correct or wrong and suggests the next step with a “More” suggestion chip.
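The comparison step can be sketched as follows, under the assumption that meanings are stored as one string separated by `|` and that answers are matched case-insensitively. The function names and reply texts are illustrative, not the original implementation.

```javascript
// Hypothetical separator between the accepted meanings of a word.
const MEANING_SEPARATOR = '|';

// Lower-case, trim and collapse whitespace so "  Thank  You " matches "thank you".
function normalize(text) {
  return String(text).toLowerCase().trim().replace(/\s+/g, ' ');
}

// True when the answer equals any of the meanings stored for the word.
function isCorrect(answer, meanings, separator = MEANING_SEPARATOR) {
  const accepted = meanings.split(separator).map(normalize);
  return accepted.includes(normalize(answer));
}

// Build the reply text for the student.
function buildReply(answer, word) {
  if (isCorrect(answer, word.meanings)) {
    return `Great! ${word.hanzi} means ${normalize(answer)}.`;
  }
  const accepted = word.meanings.split(MEANING_SEPARATOR).join(' or ');
  return `Not quite: ${word.hanzi} means ${accepted}.`;
}

const word = { hanzi: '谢谢', meanings: 'thanks|thank you' };
console.log(buildReply('Thank you', word)); // Great! 谢谢 means thank you.
console.log(buildReply('hello', word));     // Not quite: 谢谢 means thanks or thank you.
```

Exact matching after normalization is the simplest choice; a more forgiving comparison (for example ignoring a leading “to” in verbs) could be layered on top.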

We add a “more” follow-up intent to the “takeWord” intent and make sure that it has the takeWordAnswer-followup context as its input context, to avoid ambiguity when the user says “more”, which could otherwise be interpreted as an answer!

“takeWord - more” already comes with predefined training phrases, so we only have to define its fulfillment. Adding the intent to the map and pointing it at the fulfillment function that chooses a hanzi is enough!

intentMap.set('takeWord - more', takeWord);

Observe that the “takeWord - more” fulfillment simply calls the takeWord function again.
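The effect of mapping two intent names to one function can be seen in a small sketch. The handlers here are stand-ins, `dispatch()` is a simplified substitute for the lookup the fulfillment library performs when it is handed the intent map, and the intent names are written with a plain hyphen, which is an assumption about how they appear in the console.

```javascript
// Stand-in handlers; the real ones would talk to the user through `agent`.
function takeWord(agent) { agent.add('Here is a new word.'); }
function takeWordAnswer(agent) { agent.add('Checking your answer.'); }

// Two intent names share one handler, so "more" simply asks for another word.
const intentMap = new Map();
intentMap.set('takeWord', takeWord);
intentMap.set('takeWord - answer', takeWordAnswer);
intentMap.set('takeWord - more', takeWord);

// Simplified dispatch by intent name.
function dispatch(intentName, agent) {
  const handler = intentMap.get(intentName);
  if (!handler) throw new Error(`No handler for intent: ${intentName}`);
  handler(agent);
}

// Stand-in agent that records replies instead of sending them.
const agent = {
  replies: [],
  add(message) { this.replies.push(message); },
};

dispatch('takeWord - more', agent);
console.log(agent.replies);
```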

We add a last intent, “goodbye”, to allow the conversation to close gracefully.

We add a default response to close the conversation.

The action is trained to catch “bye” words and to trigger a CANCEL event.

The complete fulfillment is given for reference:

Testing the Action

To test the Action, click on “See how it works in Google Assistant” on the right side of the Dialogflow console page.

The simulator page will open; after clicking on “Start testing” you can invoke your Action by typing (or saying) “Talk to my test app”.

You will hear the introductory music and the welcome message!

When you ask for a word, the Action provides one to translate into English, with the correct Chinese tones given by the audio:

When you give the right answer, the Action cheers!

When you give the command “more”, the Action gives another word.

When you give a wrong translation, the Action provides the correct answer!

Saying “goodbye” closes the conversation gracefully.



Juna Salviati

Full-time Human-computer interpreter. Opinions are my own.