"OK Google, scroll down." How we integrated Google Assistant VUI with a website

Marek Miś
Aug 14, 2018 · 6 min read

With so much buzz around Voice User Interfaces (VUI), I decided to give them a go and experiment a little. The idea was to use a voice UI to accomplish something unusual, yet accessible to a wide audience. So, as much as I love IoT experiments (such as this one), I had to give up on them, as they require hardware to play with.

It had to be something available online, so we decided to use our company’s website as a playground.

If you just want to play with it and not read about how it’s made, skip to the bottom of this article now.

What if you could navigate the website with voice?

Oh yeah. Sit back and talk to your screen. What a fantastic way to showcase what VUI can do! Gimmick? Yes. But as a digital agency, we have a right (if not a duty) to play with the latest technology, and who knows at what point it will flourish and pay back our efforts.

And yes, things like that already exist: with the Web Speech API, you only need HTML and JS skills to build one. But as it quickly turned out, browser support for the Speech API is very poor at the moment.

Google Assistant to the rescue

Having to look for an alternative, I turned my research towards smart speakers. At our agency, we’ve recently been experimenting a lot with conversational interfaces (chatbots), and voice assistants (Google Assistant, Alexa, Siri) are a natural extension of those. It didn’t take long to pick Google Assistant as the primary voice platform for the project: it’s available on every Android device, allows for building conversations with voice and visual interfaces at the same time, comes with great SDKs and documentation, and is really fun to play with.

The Plan

So, what do we want to achieve? For starters, let’s consider a classic user journey on a corporate website:

  1. User lands on the homepage.
  2. Scrolls down a little to see what’s going on.
  3. Starts browsing areas of interest by clicking on navigation items.
  4. Reads a bit more about a particular area of interest, perhaps plays some videos.
  5. Finally, wants to get in touch and fills in the contact form.

Awesome, let’s do all that with voice.

The Tools

Now it’s getting techy. Here’s what we used:

  1. Dialogflow: to create conversational entry points and intents, and to manage everything nice and easy in Actions on Google.
  2. Node.js app: to create custom functions and handle complex queries that would otherwise be impossible in Dialogflow.
  3. Heroku: to host our Node.js app. A paid plan is needed so the app doesn’t fall asleep.
  4. Website: ideally built as a Single Page Application, to allow for smooth page transitions and to make handling sessions easier.
  5. Socket.io: to connect the VUI client and the website and allow for real-time communication.
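To make the division of labour between these tools more concrete, here is a minimal sketch of what the Node.js piece might look like. It is not our production code: the intent names, event names and reply texts are invented for illustration, and `emitToRoom` is a hypothetical callback standing in for whatever pushes events to the website over socket.io. The only thing taken from Dialogflow itself is the shape of the webhook request (`queryResult.intent.displayName`, `queryResult.parameters`) and the `fulfillmentText` field of the response.

```javascript
// Hypothetical mapping from Dialogflow intent names to website events.
// None of these intent or event names come from the original project.
const INTENT_TO_EVENT = new Map([
  ['scroll.down', { event: 'scroll', payload: { direction: 'down' } }],
  ['scroll.up', { event: 'scroll', payload: { direction: 'up' } }],
  ['navigate.page', { event: 'navigate', payload: {} }],
]);

// Turns a Dialogflow (v2) webhook request body into a website event.
// `emitToRoom(event, payload)` is an assumed callback that forwards the
// event to the paired website, for example through socket.io.
function handleWebhook(body, emitToRoom) {
  const queryResult = (body && body.queryResult) || {};
  const intentName = queryResult.intent
    ? queryResult.intent.displayName
    : undefined;
  const parameters = queryResult.parameters || {};

  const mapping = INTENT_TO_EVENT.get(intentName);
  if (!mapping) {
    // Unknown intent: reply to the user, send nothing to the website.
    return { fulfillmentText: 'Sorry, I can’t do that on the website yet.' };
  }

  // Merge the fixed payload with whatever parameters Dialogflow extracted.
  emitToRoom(mapping.event, { ...mapping.payload, ...parameters });
  return { fulfillmentText: 'Done. What next?' };
}
```

In a setup like ours, the returned object would be sent back as the JSON response of the webhook endpoint, and the website would listen for the emitted events and scroll or navigate accordingly.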

It didn’t take long to put everything together and create the first function triggered by voice. The most time-consuming part was testing and error handling. It’s incredible how many new functions, journeys, intents and flows you end up creating simply by observing people using your apps.

One of the challenges was to introduce a syncing mechanism that couples two devices together and prevents events from being broadcast to other clients. Thankfully, Stack Overflow is full of good people giving away snippets of code for pretty much every use case. A randomly generated number, created server side, checked against the combinations already in use and served back to the website frontend, did the trick. The voice agent then asks the user for this unique number to sync the devices, and once that succeeds, socket.io welcomes the agent and the website into the same room.
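The pairing idea can be sketched in a few lines of plain JavaScript. This is an illustration, not the code we shipped: the `PairingRegistry` class, its method names and the four-digit code length are all assumptions, and `Math.random` is used only because the code is a short-lived pairing number, not a secret.

```javascript
// Sketch of the pairing-code idea. The class name, method names and the
// 4-digit default are assumptions made for this example.
class PairingRegistry {
  constructor(digits = 4, random = Math.random) {
    this.digits = digits;
    this.random = random; // injectable, which makes the class easy to test
    this.occupied = new Set();
  }

  // Normalise spoken input such as 78 or "78" to a padded code like "0078".
  normalise(code) {
    return String(code).padStart(this.digits, '0');
  }

  // Draw random codes until a free one turns up, then mark it as occupied.
  reserve() {
    const capacity = 10 ** this.digits;
    if (this.occupied.size >= capacity) {
      throw new Error('No free pairing codes left');
    }
    let code;
    do {
      code = this.normalise(Math.floor(this.random() * capacity));
    } while (this.occupied.has(code));
    this.occupied.add(code);
    return code;
  }

  // True when the number given to the voice agent matches a waiting website.
  isValid(code) {
    return this.occupied.has(this.normalise(code));
  }

  // Free the code again, for example when the website disconnects.
  release(code) {
    this.occupied.delete(this.normalise(code));
  }
}
```

With socket.io, the reserved code can double as the room name: the website’s socket joins it with `socket.join(code)`, and once the voice agent has supplied a valid code, the server targets only that pair with `io.to(code).emit(...)` instead of broadcasting to every client.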

Publish your Action

To make our Action discoverable, we had to go through the approval process. It takes Google around 1–2 days to do that.
Attention! Every time you want to publish even the smallest change made in Dialogflow, you must go through the approval process again. It’s quite annoying, to be honest, but hey, quality comes first.

The biggest trick was to make one intent discoverable by saying: “OK Google, ask Greenwood Campbell to do something amazing”. That way, we could take the user on our special journey and use this as a context in the Dialogflow setup. Context also serves as a kind of memory for the voice agent.

As it turned out later, Google also indexed our intent for implicit invocation: “OK Google, control a website with voice”. You don’t have to mention our agency name, Greenwood Campbell, at all (it’s almost like getting a website domain for free!). More on actions discovery and invocation types here.

You can still trigger the Greenwood Campbell Action by saying “OK Google, talk to Greenwood Campbell”, but that way you will never find the website controller. Our experiment was planned to begin with the user interacting with the website first.

Miś, not meeshh

We had the most fun (and by fun, I mean we struggled badly…) with the contact form.

At first, we let users say their name to fill in the fields on the form. But if your name is not English, or not very common, Google Assistant either didn’t get it or horribly misspelled it. My last name is Miś, Polish for “teddy bear”, but Google never got it right. And I wasn’t happy sending wrong data through the contact form.

But every Google Assistant user is identified in Google’s database in some way. And there’s potentially all sorts of data to retrieve from your account, right?
Well, not every kind of data, and the user must first allow Google Assistant to use it. But it works, and provides accurate results. Check out here what you can get access to when building your Actions. Contact form problem solved.

Takeaways

Launching this experiment live boosted traffic on our website by a few hundred percent, with a massive decline in bounce rate and a great improvement in average time spent on site. It also brought us some recognition in website galleries (and awards!). And we have a great demo of VUI capabilities, even if it’s just a playful little thing. The best takeaway, though, is what we learned about a variety of systems, and the confidence to deliver Actions on Google to our clients.

Let’s play!

Here it is: https://www.greenwoodcampbell.com/

Use a decent browser and a laptop/desktop screen to start the VUI controller. You will need your phone with Google Assistant, or a Google Home, nearby to interact with it. Have fun :)

If you too are experimenting with VUI and integrations, let me know about it in the comments. I’m always happy to hear about exciting ideas. Cheers!

EDIT: since this integration is no longer part of the greenwoodcampbell.com website, I thought a video of how it worked would be useful. Here it is (the sound isn’t great, sorry!)

Coinmonks

Coinmonks is a non-profit Crypto educational publication.


Marek Miś

Written by

Voice Assistants Enthusiast | Designer | Developer | Tech Explorer | https://veeheister.com/
