Prototyping Speech Recognition in Framer.js

Brian Bailey · Published in Framer · Aug 3, 2016

Project available on GitHub.

What you’ll learn

  • How to connect to the Web Speech API
  • How to access your device’s audio input
  • How to access your device’s voice synthesizer

What you’ll need

  • The sample code
  • Framer Studio (or Framer.js and a text editor)
  • Basic knowledge of HTML, CSS, and JavaScript (CoffeeScript)
  • Chrome Stable 33 or greater

Overview

The Web Speech API gives web apps the ability to recognize speech, transform audio input into strings, and control the synthesis voices available on the device.

Its two parts, SpeechRecognition (Asynchronous Speech Recognition) and SpeechSynthesis (Text-to-Speech), allow designers to prototype speech-based conversational UIs like Google Now, Apple’s Siri, and Amazon’s Alexa.

The Web Speech API is flagged as an experimental feature in Chrome and Firefox, and is supported in Chrome Stable 33 and greater.
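
Before wiring anything up, it’s worth checking which parts of the API the current browser exposes. A minimal sketch in Framer’s CoffeeScript (the webkit prefix is what Chrome currently uses):

# SpeechRecognition is still prefixed in Chrome
SpeechRecognition = window.SpeechRecognition or window.webkitSpeechRecognition

if SpeechRecognition?
    print "SpeechRecognition is available"
else
    print "SpeechRecognition is not supported in this browser"

# SpeechSynthesis is unprefixed and more widely supported
if window.speechSynthesis?
    print "SpeechSynthesis is available"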

SpeechRecognition Prototype

tl;dr This prototype will not function in Framer Studio.

Using Chrome, you can interact with the sample prototype of Google’s iOS app, or clone this repo. Your browser may request permission to use the microphone.

Framer Studio, the official coding environment for Framer.js, is a Safari-based application, and Safari doesn’t support the SpeechRecognition interface of this experimental API. (Safari does support the SpeechSynthesis interface, however.) Framer Studio will likely throw the error below, and you may not be able to interact with your prototype’s preview in the IDE.

TypeError: undefined is not a constructor (evaluating 'new SpeechRecognition')

To get around this, we’ll run python -m SimpleHTTPServer [port] in the directory of the prototype's index.html file and interact with the prototype in Chrome 33 or greater. (SpeechRecognition doesn't trigger the microphone on the server Framer Studio generates.)

  1. Open Terminal
  2. cd into speech-recognition.framer
  3. Type: python -m SimpleHTTPServer 8090
  4. In Chrome, navigate to http://127.0.0.1:8090/

This will serve the prototype from the current working directory.
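
Note that SimpleHTTPServer is the Python 2 module name; if your machine runs Python 3, the equivalent command is:

python3 -m http.server 8090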

Create a server and open the prototype in Chrome for full functionality.

SpeechRecognition Interface

The SpeechRecognition interface allows us to recognize speech and respond accordingly. PromptWorks’ piece on Speech Recognition in the Browser provided the snippet below as JavaScript, which I converted to CoffeeScript (and then Framer.js) with js2coffee.

You can paste this into Framer Studio and open the preview in Chrome.

Your browser may request permission to use the microphone.

# This API is currently prefixed in Chrome
SpeechRecognition = window.SpeechRecognition or window.webkitSpeechRecognition

# Create a new recognizer
recognizer = new SpeechRecognition

# Start producing results before the person has finished speaking
recognizer.interimResults = true

# Set the language of the recognizer
recognizer.lang = 'en-US'

# Define a callback to process results
recognizer.onresult = (event) ->
    result = event.results[event.resultIndex]
    # Interim and final results are handled identically here;
    # result.isFinal lets you treat them differently if needed
    if result.isFinal
        print result[0].transcript
    else
        print result[0].transcript
    return

# Start listening...
recognizer.start()
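
Chrome stops the recognizer after a stretch of silence. If you want the prototype to keep listening, one approach (a sketch, not part of the original project) is to enable continuous results and restart the recognizer whenever it ends:

# Keep capturing results instead of stopping after the first phrase
recognizer.continuous = true

# Chrome ends recognition after silence; restart to keep listening
recognizer.onend = ->
    recognizer.start()
    return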

Now we can do any number of things with the transcript, which is now a string. For example, you can pass the output as HTML to a layer.

textBox = new Layer
    backgroundColor: "none"
    color: "#969696"
    html: "Speak now"

textBox.style =
    "fontSize": "50px"
    "fontWeight": "300"
    "textAlign": "left"
    "fontFamily": "Arial"

recognizer.onresult = (event) ->
    result = event.results[event.resultIndex]
    if result.isFinal
        textBox.html = result[0].transcript
    else
        textBox.html = result[0].transcript
    return
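
The if/else branches above are identical, so finality is currently ignored. One way to make the distinction visible, purely as a sketch, is to dim interim text and darken it once the result is final:

recognizer.onresult = (event) ->
    result = event.results[event.resultIndex]
    textBox.html = result[0].transcript
    if result.isFinal
        # Final result: render in a solid color
        textBox.color = "#333333"
    else
        # Interim result: keep it dimmed while recognition continues
        textBox.color = "#969696"
    return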

SpeechSynthesis Interface

The SpeechSynthesis interface provides controls and methods for the synthesis voices available on the device. Browser compatibility is better for this interface, with support in Safari and on several mobile browsers.

Snippets from PromptWorks.

speechSynthesis.speak new SpeechSynthesisUtterance('Hello world.')

Incrementing the index in voices[1] should let you cycle through your device's synthesis voices.

voices = speechSynthesis.getVoices()
utterance = new SpeechSynthesisUtterance('Hello world.')
utterance.voice = voices[1]
speechSynthesis.speak utterance
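
One caveat: in Chrome, getVoices() returns an empty array until the browser has loaded its voice list and fired the voiceschanged event. A defensive sketch (speakWithVoice is a helper invented here, not part of the original project):

speakWithVoice = (text, index) ->
    voices = speechSynthesis.getVoices()
    utterance = new SpeechSynthesisUtterance(text)
    # Fall back to the default voice if the index is out of range
    utterance.voice = voices[index] if voices[index]?
    speechSynthesis.speak utterance
    return

# Chrome populates the voice list asynchronously
if speechSynthesis.getVoices().length > 0
    speakWithVoice "Hello world.", 1
else
    speechSynthesis.onvoiceschanged = -> speakWithVoice "Hello world.", 1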

