Evolution of forms — voice-enabled way!

Published in Globant

4 min read · Jul 9, 2020

How cool and convenient would it be to access the web just by speaking to our machine, hands-free, while enjoying a snack or doing a chore? It does sound amazing, doesn’t it? Just think how helpful it would be for differently abled people to access the web in an uninterrupted way like this.

“Components are building blocks of the UI, and accessibility of your entire UI is improved if the Components have better accessibility provisions.”

The world is already experiencing how convenient this can be through products like Alexa, Google Home, Siri, and many more.

Living up to this challenge, we have come up with the Sample of speech recognition API. It is built on top of the SpeechRecognition interface of the Web Speech API, which is native to the browser and easy to use.

This speech recognition API targets native HTML elements such as radio buttons, input boxes, drop-down boxes, and buttons, and makes them voice-enabled.

With the help of this API, we have developed a basic use case: a voice-enabled registration form that users can fill in entirely without touching the keyboard or mouse. This takes us to the next level of UI access.

Before we see the demo, let’s check some basic concepts for voice-enabled components.

How does SpeechRecognition work and make speech recognition possible in browsers?

The Web Speech API is available in selected browsers. It exposes the `SpeechRecognition` interface which, in browsers like Chrome, relies on a server-based recognition engine to convert voice to text. As a result, at the time of writing, it won’t work offline.

You can also check the browser compatibility table for support of this native API.

What is the Sample of speech recognition API?

The Sample of speech recognition API is a custom library developed by the smart UI initiative team at Globant India and demonstrated at the UINxt 2020 event.

This API is built in vanilla JS on top of the Web Speech API and uses various methods of the SpeechRecognition interface, such as:

SpeechRecognition.start(): Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.

SpeechRecognition.stop(): Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.

SpeechRecognition.abort(): Stops the speech recognition service from listening to incoming audio, and doesn’t attempt to return a SpeechRecognitionResult.

And events such as:

result: Fired when the speech recognition service returns a result, meaning a word or phrase has been positively recognized and communicated back to the app. Also available via the onresult property.

error: Fired when a speech recognition error occurs. Also available via the onerror property.

speechend: Fired when speech recognized by the speech recognition service has stopped being detected. Also available via the onspeechend property.
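To make the `result` event a little more concrete, here is a minimal sketch of pulling the recognized text out of the event payload. The helper name `transcriptFromResult` is hypothetical and not part of the Sample of speech recognition API; the properties it reads (`results`, `resultIndex`, `transcript`, `confidence`, `isFinal`) come from the Web Speech API.

```javascript
// Hypothetical helper: returns the newest recognized phrase from a `result` event.
// event.results is a list of results; each result is a list of alternatives,
// and the first alternative ([0]) is the one the engine ranks highest.
function transcriptFromResult(event) {
  const result = event.results[event.resultIndex];
  const best = result[0];
  return {
    text: best.transcript.trim(),   // recognized words, without stray spaces
    confidence: best.confidence,    // engine's confidence estimate, 0 to 1
    isFinal: result.isFinal,        // false while the phrase is still interim
  };
}
```

In a page, this could be attached with `recognition.addEventListener('result', (event) => console.log(transcriptFromResult(event).text));`.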

With the help of the SpeechRecognition methods and events, the Sample of speech recognition API can target native HTML UI elements and use their properties to make them voice-enabled.
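As a rough illustration of what targeting a native element can look like, the sketch below writes a spoken phrase into different kinds of controls using only their standard properties. This is not the library's actual implementation: the function name `applySpokenValue` and the matching rules (match a drop-down by its visible label, a radio button by its value, a button by its text) are assumptions made for the example.

```javascript
// Hypothetical helper: applies a spoken phrase to a native form control.
// Returns true when the phrase was applied, false otherwise.
function applySpokenValue(element, spoken) {
  const phrase = spoken.trim().toLowerCase();
  const tag = element.tagName.toLowerCase();

  if (tag === 'select') {
    // Pick the option whose visible label matches the phrase.
    const match = Array.from(element.options)
      .find((option) => option.text.trim().toLowerCase() === phrase);
    if (match) element.value = match.value;
    return Boolean(match);
  }
  if (tag === 'input' && (element.type === 'radio' || element.type === 'checkbox')) {
    // Check the control only when its value was spoken.
    const matches = element.value.toLowerCase() === phrase;
    if (matches) element.checked = true;
    return matches;
  }
  if (tag === 'input' || tag === 'textarea') {
    // Free text: keep the user's words as spoken.
    element.value = spoken.trim();
    return true;
  }
  if (tag === 'button') {
    // Treat the button label as the voice command.
    const matches = element.textContent.trim().toLowerCase() === phrase;
    if (matches) element.click();
    return matches;
  }
  return false;
}
```

For example, `applySpokenValue(document.querySelector('#country'), 'India')` would select the option labelled "India", assuming a drop-down with that id exists on the page.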

OK, enough of theory, let’s get hands-on! The way we developers love it!

Here you will find the basic setup of speech recognition in your browser code. With this setup, you can make several voice-enabled use cases possible.

Start by setting up the SpeechRecognition object.
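A minimal sketch of this step, assuming the usual pattern of trying the standard name first and then the WebKit-prefixed name that Chrome exposes. The names `resolveSpeechRecognition` and `SpeechRecognitionCtor` are made up for this example.

```javascript
// Resolve the constructor: the standard name first, then the WebKit-prefixed
// one. `scope` is normally `window`; null means the API is missing.
function resolveSpeechRecognition(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// In a browser page, globalThis is the same object as window.
const SpeechRecognitionCtor = resolveSpeechRecognition(globalThis);
```

Passing the scope in as a parameter is only a convenience that keeps the helper easy to try outside a browser.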

Then check whether SpeechRecognition is available in the browser.
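One way to write this check is sketched below. The function name `isSpeechRecognitionSupported` is hypothetical, and what to do when support is missing (here, just a console warning) is up to your app.

```javascript
// Returns true when the environment exposes speech recognition under
// either the standard or the WebKit-prefixed name.
function isSpeechRecognitionSupported(scope) {
  return 'SpeechRecognition' in scope || 'webkitSpeechRecognition' in scope;
}

if (!isSpeechRecognitionSupported(globalThis)) {
  // Keep the form usable with keyboard and mouse instead of failing.
  console.warn('Speech recognition is not supported in this environment.');
}
```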

Create an instance of SpeechRecognition.
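Creating the instance could look like the sketch below. The factory name `createRecognition` and the default values are assumptions for this example; `lang`, `interimResults`, and `maxAlternatives` are standard properties of the SpeechRecognition interface.

```javascript
// Hypothetical factory: builds a configured recognition instance, or returns
// null when the constructor is unavailable. `scope` is normally `window`.
function createRecognition(scope, options = {}) {
  const Ctor = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Ctor) return null;

  const recognition = new Ctor();
  recognition.lang = options.lang || 'en-US'; // language to recognize (assumed default)
  recognition.interimResults = false;         // deliver only final results
  recognition.maxAlternatives = 1;            // one best guess per result
  return recognition;
}

const recognition = createRecognition(globalThis);
```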

Once the recognition instance is available, you can start speech recognition with a simple invocation.
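The invocation itself is just `recognition.start()`. The sketch below wraps it in a small, hypothetical helper (`startListening`) because `start()` throws if recognition is already running, which is easy to trigger with a double click.

```javascript
// Hypothetical helper: starts listening and reports whether it succeeded.
// start() throws an InvalidStateError when recognition has already started.
function startListening(recognition) {
  try {
    recognition.start();
    return true;
  } catch (error) {
    console.warn('Could not start speech recognition:', error.message);
    return false;
  }
}
```

A common choice is to call this from a button's click handler, since the browser will usually ask the user for microphone permission the first time.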

Now you can implement its basic method for performing voice-based operations.

All the processing that enables voice for native HTML elements goes here. For example, making a form in your app voice-enabled can be achieved with the help of this method.
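Here is a minimal sketch of such a handler, assuming it is the `onresult` callback and assuming a made-up voice convention where the user says a field name followed by its value (for example, "name Jane Doe"). The function names `parseFieldCommand` and `enableVoiceForForm` are hypothetical, and the original demo may use a different scheme.

```javascript
// Hypothetical convention: the first spoken word names the field and the
// rest is its value. Returns null when the first word is not a known field.
function parseFieldCommand(transcript, fieldNames) {
  const words = transcript.trim().split(/\s+/);
  const field = fieldNames.find((name) => name.toLowerCase() === words[0].toLowerCase());
  if (!field) return null;
  return { field, value: words.slice(1).join(' ') };
}

// Wires the handler: each recognized phrase is parsed and written into the form.
// `form` is an HTMLFormElement whose controls have matching `name` attributes.
function enableVoiceForForm(recognition, form, fieldNames) {
  recognition.onresult = (event) => {
    const transcript = event.results[event.resultIndex][0].transcript;
    const command = parseFieldCommand(transcript, fieldNames);
    if (command) {
      form.elements[command.field].value = command.value;
    }
  };
}
```

With a form that has `name` and `city` inputs, `enableVoiceForForm(recognition, form, ['name', 'city'])` would let the user say "city Pune" to fill in the city field.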

If you want recognition to keep running, you can make that happen simply by setting a flag.

In this way, recognition restarts automatically, without any manual step.
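A minimal sketch of this idea, assuming the flag is the standard `continuous` property and that the restart is done from the `end` event. The helper name `keepListening` is hypothetical.

```javascript
// Hypothetical helper: keeps recognition running until the caller stops it.
function keepListening(recognition) {
  let active = true;

  // The flag: keep returning results instead of stopping after one phrase.
  recognition.continuous = true;

  // A session may still end on its own (for example after a long silence),
  // so start again whenever it ends while we still want to listen.
  recognition.onend = () => {
    if (active) recognition.start();
  };

  recognition.start();

  // Returns a function that ends the loop for good.
  return function stopListening() {
    active = false;
    recognition.stop();
  };
}
```

Keeping the returned `stopListening` function around gives the user a way to switch the microphone off again, for example from a "stop" button.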

So let’s put this together to form a basic setup of speech recognition in your application.
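Put together, the setup might look like the sketch below. It follows the steps described in this section, but it is an illustration rather than the original demo code: the function name `setupSpeechRecognition`, the `onPhrase` callback, and the `en-US` language are assumptions.

```javascript
// `scope` is normally `window`; `onPhrase` is a hypothetical callback that
// receives each finally recognized phrase as a string.
function setupSpeechRecognition(scope, onPhrase) {
  // 1. Resolve the constructor and check availability.
  const Ctor = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Ctor) return null;

  // 2. Create and configure the instance.
  const recognition = new Ctor();
  recognition.lang = 'en-US';      // assumed language
  recognition.continuous = true;   // keep listening across phrases

  // 3. Handle results and errors.
  recognition.onresult = (event) => {
    const result = event.results[event.resultIndex];
    if (result.isFinal) onPhrase(result[0].transcript.trim());
  };
  recognition.onerror = (event) => {
    console.warn('Speech recognition error:', event.error);
  };

  // 4. Start listening.
  recognition.start();
  return recognition;
}
```

In a page this could be called as `setupSpeechRecognition(window, (phrase) => console.log(phrase));`, with the callback replaced by whatever fills in your form.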

Now, for the real thing! The ultimate demo! Check this out

We were able to use the browser’s Speech Recognition API and implement it in a very basic way for demo purposes. Of course, we can use it for more than a form and make things completely hands-free. Our dream way!

Written by Rohan Ambhore