More SSML for Actions on Google!

Silvano Luciani
Dec 18, 2017

Back in April, Leon introduced you to Actions on Google support for SSML, presenting the first set of supported SSML elements.

We’re now introducing a second wave of elements that let you further enrich the atmosphere and overall user experience of your app for the Google Assistant.

Let’s check all these new elements with some examples!

How to test the examples.

First of all, let me tell you how you can test these examples on your own; it won’t be much fun if you can’t actually listen to them :)

The easiest way is to use the TTS simulator included in the Actions Console. Open the Actions Console in a browser window and select any existing project that you might have; if you don’t already have one, create a new project. Select Simulator in the left navigation bar, then select the AUDIO tab in the Simulator.

The SSML simulator in the Actions on Google console.

To test the examples from this post, just copy the SSML into the TTS simulator and then click UPDATE AND LISTEN.

Control rhythm and sound with <prosody>.

Using the <prosody> element, you can control the rate, pitch and volume of the speech output. Try this self-explanatory example in the TTS simulator!
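A minimal sketch of what a <prosody> example can look like. The sentences are placeholder text of my own, and the attribute values are standard SSML ones (named rates and volumes, pitch shifts in semitones), so adjust them to taste:

```xml
<speak>
  <!-- rate: how fast the text is spoken -->
  <prosody rate="slow">I can speak slowly,</prosody>
  <prosody rate="fast">or I can speak quickly.</prosody>
  <!-- pitch: shift the voice up or down, here by 3 semitones -->
  <prosody pitch="+3st">I can raise my pitch,</prosody>
  <prosody pitch="-3st">or I can lower it.</prosody>
  <!-- volume: how loud the text is spoken -->
  <prosody volume="loud">I can be loud,</prosody>
  <prosody volume="soft">or I can be quiet.</prosody>
</speak>
```

You can also combine several attributes on the same <prosody> element, for example rate="slow" together with pitch="-2st".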

M-O-R-E P-R-O-N-U-N-C-I-A-T-I-O-N

We’ve added support for more values of the interpret-as attribute of the <say-as> element: fraction, unit, expletive and spell-out. Check them out with the next example!
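Here is a short sketch covering the four values. The text inside each element is illustrative only, so swap in whatever you want to hear:

```xml
<speak>
  <!-- fraction: read as a fraction rather than symbol by symbol -->
  I ate <say-as interpret-as="fraction">5+1/2</say-as> pizzas.
  <!-- unit: the unit is read according to the number -->
  The table is <say-as interpret-as="unit">10 foot</say-as> long.
  <!-- expletive: the content is bleeped out -->
  That was a <say-as interpret-as="expletive">censor this</say-as> surprise!
  <!-- spell-out: the content is read letter by letter -->
  My name is spelled <say-as interpret-as="spell-out">Silvano</say-as>.
</speak>
```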

Sound mixer!

Now let’s check something unique to Actions on Google that will make your apps more fun and engaging: <par>, <seq> and <media>. You can use these new elements to layer background music and sounds with your speech, creating a richer atmosphere for your apps.

<media> allows you to define media to be rendered; you can then use <par> to create a parallel media container or <seq> to create a sequential media container. Using the attributes of <media>, you can also define timing relationships between media elements, fade in/out effects and looping.
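To make the difference between the two containers concrete, here is a minimal <seq> sketch in which the media elements are rendered one after another. The spoken text is placeholder wording, and the audio URL is assumed to point to a clip in the Sound Library; replace it with any audio file you can reach:

```xml
<speak>
  <seq>
    <!-- First element: a spoken prompt -->
    <media>
      <speak>Here comes the sound.</speak>
    </media>
    <!-- Second element: rendered only after the prompt has finished -->
    <media soundLevel="-6dB">
      <!-- Assumed Sound Library URL; swap in your own clip if needed -->
      <audio src="https://actions.google.com/sounds/v1/cartoon/cartoon_boing.ogg"/>
    </media>
    <!-- Third element: another prompt, after the sound -->
    <media>
      <speak>And that was it.</speak>
    </media>
  </seq>
</speak>
```

Replacing <seq> with <par> would make all three elements start together unless you set begin on them.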

Let’s hear a sample! We will ask a question, then reply 2 seconds after the end of the question. At the same time, we will play a cat purr sound three times, applying a fade-in effect over 2 seconds. Just before the end of the answer, we will also play a “boing” sound.
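A sketch of how that scenario can be written with <par>. The question and answer text are placeholders, the sound levels are arbitrary, and both audio URLs are assumed to point to clips in the Sound Library:

```xml
<speak>
  <par>
    <!-- The question starts half a second into the container -->
    <media xml:id="question" begin="0.5s">
      <speak>Who invented the Internet?</speak>
    </media>
    <!-- The answer begins 2 seconds after the question ends -->
    <media xml:id="answer" begin="question.end+2.0s">
      <speak>The Internet was invented by cats.</speak>
    </media>
    <!-- The "boing" starts 0.2 seconds before the answer ends -->
    <media begin="answer.end-0.2s" soundLevel="-6dB">
      <audio src="https://actions.google.com/sounds/v1/cartoon/cartoon_boing.ogg"/>
    </media>
    <!-- The purr plays 3 times from the start, fading in over 2 seconds -->
    <media repeatCount="3" soundLevel="+2.28dB" fadeInDur="2s" fadeOutDur="0.2s">
      <audio src="https://actions.google.com/sounds/v1/animals/cat_purr_close.ogg"/>
    </media>
  </par>
</speak>
```

The xml:id values are what make the relative timing possible: begin="question.end+2.0s" anchors the answer to the end of the element named question.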

A very cool use case enabled by this addition is the ability to play a continuous background sound or music and overlay different TTS prompts on top of it. Let’s check another example where we play background music while we present a series of prompts to the user, with the music loop stopping at the end of the prompts.

Pretty cool, huh? :)
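One way to sketch this, assuming the end attribute of <media> can reference the end of the last prompt. The prompts are placeholder text, the repeat count and sound level are arbitrary, and the music URL is hypothetical, so point it at a track of your own or one from the Sound Library:

```xml
<speak>
  <par>
    <!-- Prompts are chained: each one begins shortly after the previous one ends -->
    <media xml:id="first" begin="1s">
      <speak>Welcome to the trivia game.</speak>
    </media>
    <media xml:id="second" begin="first.end+1s">
      <speak>I will ask you three questions.</speak>
    </media>
    <media xml:id="last" begin="second.end+1s">
      <speak>Are you ready to play?</speak>
    </media>
    <!-- Background music: loops under the prompts and is cut when the last prompt ends -->
    <media end="last.end" repeatCount="10" soundLevel="-10dB" fadeInDur="1s" fadeOutDur="1s">
      <!-- Hypothetical URL, replace with a real audio file -->
      <audio src="https://example.com/background_music.ogg"/>
    </media>
  </par>
</speak>
```

The repeat count only needs to be large enough for the loop to outlast the prompts; the end attribute is what stops the music.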

Next Steps

You can find more detailed information about all the elements supported by Actions on Google in our SSML docs, and a wide selection of sounds that you can use in your apps in our Sound Library. If you want to see examples of all the supported elements, check our new SSML sample.

We can’t wait to experience what you will create!
