“Computer! Tea, Earl Grey, Hot”: Offline Voice on NodeJS

David Bartle
Feb 11 · 3 min read

Articles on speech recognition have no shortage of Star Trek references. Indeed, in 2017 Amazon added the famous “Computer” wake word to Echo devices as an alias for “Alexa”, in a nod to the legendary television and film series. In 2021, it’s now possible to recreate this experience on commodity hardware that processes voice privately and entirely offline. Let’s recreate the replicator, where Captain Picard orders his usual beverage, in NodeJS.

Private Voice AI understanding the captain’s beverage order, in NodeJS

The first step is the “Computer” wake word, or hotword: always-listening voice commands that serve to trigger a device to do something, including listening for subsequent (and typically more complex) naturally-spoken phrases. We’ll need to complete three steps to get this up and running:

  1. Set up the microphone
  2. Create a new NodeJS project and add dependencies
  3. Run the script to listen for “Computer” and output detection events

1. Set up the Microphone

The trickiest part is getting data from a microphone, since — to my knowledge — there’s no built-in cross-platform Node package that can directly access the microphone.

Instead, we’ll use the helpful node-record-lpcm16 package, which essentially fires up a separate process (SoX or arecord) and provides the audio to us in Node. This means you need to set up SoX or arecord, so that the node-record-lpcm16 package can access it.

See this documentation for platform-specific instructions to set up your microphone. This will work for macOS (x86_64), Linux (x86_64), and Raspberry Pi (2–4, running 32-bit Raspberry Pi OS).
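Before moving on, it can help to confirm that Node can actually launch one of the two recorder programs. Here is a minimal sketch that uses only Node's built-in child_process module. The function name findRecorder and the order in which it tries the programs are my own choices for illustration, not part of node-record-lpcm16.

```javascript
const { spawnSync } = require("child_process");

// Return the first recorder program that can be launched, or null if none can.
// The candidate order (SoX first, then arecord) is an assumption for illustration.
function findRecorder(candidates = ["sox", "arecord"]) {
  for (const name of candidates) {
    // Both programs accept --version. If the program is not on the PATH,
    // spawnSync reports the launch failure in result.error.
    const result = spawnSync(name, ["--version"], { stdio: "ignore", timeout: 5000 });
    if (!result.error) {
      return name;
    }
  }
  return null;
}

const found = findRecorder();
console.log(
  found
    ? `Found recorder: ${found}`
    : "No recorder found; install SoX or arecord first."
);
```

This only checks that the program starts. It does not verify that a microphone is connected or selected as the default input device.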

2. Create a new NodeJS project using npm (or yarn):

Create a new folder called “replicator”, initialize a new npm project, and install the Porcupine (wake word detection) and node-record-lpcm16 (microphone) dependencies:

mkdir replicator && cd replicator
npm init -y
npm install @picovoice/porcupine-node node-record-lpcm16
touch index.js

Open index.js and paste in the following script (if you used arecord instead of SoX, change ‘sox’ in Line 14 to ‘arecord’):

NodeJS script with Porcupine that detects the “Computer” hotword. Note: the recorder here is set to use SoX.

3. Running the demo

Run the script, and say “Computer”:

$ node index
Listening for 'COMPUTER'...
Press ctrl+c to exit.
Detected 'COMPUTER'

Voilà! Our NodeJS script is recognizing the “Computer” wake word.

Understanding the code

The essence of the code is:

  1. Continuously receive microphone “data” events from node-record-lpcm16; as the package name suggests, it delivers 16-bit linear PCM audio, which we record at 16 kHz, the de facto industry standard format for speech processing libraries
  2. Accumulate the data into arrays of integers. Once we’ve accumulated enough to fill an entire frame of audio, pass it to Porcupine and receive essentially a yes/no response for “Computer”
  3. Save any leftover samples that don’t fill a complete frame, for the next “data” event
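The three steps above can be sketched without the microphone or the engine at all. This is not the script from this post, just an illustration of the buffering idea. The names createFrameHandler and chunkArray are hypothetical, detect is a stand-in for the wake word engine call, and the frame length of 512 is an assumption (in a real script, read it from the engine instead of hard-coding it).

```javascript
const FRAME_LENGTH = 512; // assumed samples per frame; read the real value from the engine

// Split an array into consecutive chunks of `size` (the last chunk may be shorter).
function chunkArray(array, size) {
  const chunks = [];
  for (let i = 0; i < array.length; i += size) {
    chunks.push(array.slice(i, i + size));
  }
  return chunks;
}

// Build a "data" handler around a detector callback.
// `detect(frame)` is a hypothetical stand-in for the wake word engine call.
function createFrameHandler(detect, frameLength = FRAME_LENGTH) {
  let leftover = []; // samples carried over from the previous "data" event

  return function onData(buffer) {
    // Interpret the raw bytes as signed 16-bit little-endian samples.
    // Simplification: a trailing odd byte, if any, is dropped.
    const sampleCount = Math.floor(buffer.length / 2);
    const samples = new Array(sampleCount);
    for (let i = 0; i < sampleCount; i++) {
      samples[i] = buffer.readInt16LE(i * 2);
    }

    const frames = chunkArray(leftover.concat(samples), frameLength);

    // Hold back a trailing partial frame until more audio arrives.
    if (frames.length > 0 && frames[frames.length - 1].length < frameLength) {
      leftover = frames.pop();
    } else {
      leftover = [];
    }

    for (const frame of frames) {
      detect(frame);
    }
    return leftover.length; // returned only to make the sketch easy to inspect
  };
}

// Usage with a stub detector that just counts full frames.
let frameCount = 0;
const onData = createFrameHandler(() => { frameCount += 1; }, 4);
const pending = onData(Buffer.alloc(2 * 10)); // 10 silent samples: 2 frames of 4, 2 held back
console.log(frameCount, pending);
```

Keeping the leftover samples in a closure means the handler can be attached directly to a stream’s “data” event, and no audio is lost when an event boundary falls in the middle of a frame.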

We set up an instance of the Porcupine engine that will listen for “Computer” at a sensitivity of 0.5. Sensitivity is a parameter between 0 and 1 that trades missed detections against false alarms: a higher value catches more utterances of the wake word, but also raises the false alarm rate. You can increase or decrease it based on your particular scenario.

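// Porcupine wake word engine and its built-in "Computer" keyword
const Porcupine = require("@picovoice/porcupine-node");
const { COMPUTER } = require("@picovoice/porcupine-node/builtin_keywords");
// Microphone capture, backed by SoX or arecord
const recorder = require("node-record-lpcm16");
// One keyword and one sensitivity, passed as parallel arrays
const porcupineInstance = new Porcupine([COMPUTER], [0.5]);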

Note that both arguments to the Porcupine constructor are arrays. This is because Porcupine supports listening to multiple wake words simultaneously.
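With more than one wake word, the yes/no answer becomes “which one?”. Here is a small sketch of how a script might map the engine’s result back to a name. It assumes the engine reports the position of the detected keyword in the array passed to the constructor, or -1 when nothing was detected, so check the package documentation for your version. “Bumblebee” and the 0.7 sensitivity are hypothetical values, and labelForIndex is a helper name of my own.

```javascript
// Labels in the same order as the keywords passed to the engine constructor.
// "Computer" comes from this post; "Bumblebee" is a hypothetical second wake word.
const KEYWORD_LABELS = ["Computer", "Bumblebee"];
const SENSITIVITIES = [0.5, 0.7]; // one sensitivity per keyword, same order

// The two arrays are parallel, so their lengths must match.
if (KEYWORD_LABELS.length !== SENSITIVITIES.length) {
  throw new Error("Each keyword needs exactly one sensitivity");
}

// Translate an engine result into a label.
// Assumes the result is the index of the detected keyword, or -1 for none.
function labelForIndex(keywordIndex, labels = KEYWORD_LABELS) {
  if (keywordIndex < 0 || keywordIndex >= labels.length) {
    return null;
  }
  return labels[keywordIndex];
}

console.log(labelForIndex(0));  // first keyword in the array
console.log(labelForIndex(-1)); // nothing detected
```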

Now that we have the wake word, the next step is to understand the follow-on command: “tea, earl grey, hot”. Continued in Part II.

Picovoice

Edge Voice AI Platform

Thanks to Ian Lavery

David Bartle

Written by

Picovoice

Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Google services, Picovoice runs entirely on-device while being more accurate.
