HACK: On-device wake detection system.

Yamen · Published in The Andela Way · 4 min read · Jul 12, 2018

Photo by Ben Kolde on Unsplash

This post details how to build an “On-device wake word detection system” relatively quickly. An example of an on-device wake system can be found on some iPhones, where the device listens for the phrase “Hey Siri” and then activates the assistant.

I think it is important to note that, while we can implement this rather quickly, the underlying technology that drives on-device wake word detection is quite extraordinary. The system we will build upon is open-source software called PORCUPiNE, provided by Picovoice.

PORCUPiNE uses deep neural networks trained on real-world data. It works across platforms and boasts better performance than competing products.

We will build a really quick demo in Python. I chose Python because it is the technology I am most conversant with, but PORCUPiNE also works on Android, iOS and IoT devices.

In the end, we will have the demo application running on our computer’s terminal, and it will tell us when the keyword is detected.

The running application listens for the keyword, and prints to the shell.

Getting started:

PORCUPiNE can be cloned to your machine; however, I found that this process takes a little more time because of how large the repository is. If you choose to follow this route, demoing your system is as easy as installing a few dependencies and running a few commands.

However, I have extracted the parts that are useful to us into a separate repository, allowing for a faster download with little to no excess. If you prefer a bare-bones, stripped-down version specifically for Python, I have made it available in the instructions below. The dependencies are shared between these methods, so the next steps are important regardless of which route you choose.

Installations:

$ brew install portaudio

PortAudio is a cross-platform audio I/O library. We will use it alongside PyAudio, the Python bindings for PortAudio, to listen through the computer’s microphone. (The brew command above is for macOS; on Linux, install PortAudio through your distribution’s package manager.)
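Before wiring anything up, it helps to understand what a wake word engine reads from the microphone: fixed-length frames of 16-bit, single-channel PCM audio. Assuming typical values of a 512-sample frame at 16 kHz (check your instance’s frame_length and sample_rate attributes rather than hard-coding these), the buffer arithmetic looks like this:

```python
def frame_bytes(frame_length: int, sample_width: int = 2, channels: int = 1) -> int:
    """Bytes returned per microphone read (16-bit mono PCM by default)."""
    return frame_length * sample_width * channels

def frame_duration_ms(frame_length: int, sample_rate: int = 16000) -> float:
    """Duration of one audio frame in milliseconds."""
    return 1000.0 * frame_length / sample_rate

# Assuming 512-sample frames at 16 kHz (illustrative values only):
print(frame_bytes(512))        # 1024 bytes per read
print(frame_duration_ms(512))  # 32.0 ms of audio per frame
```

This is why the demo later reads exactly handle.frame_length samples per loop iteration: the engine expects complete frames, not arbitrary chunks.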

A Python virtual environment will come in really handy here, but isn’t mandatory. After activating the virtual environment, we can go on to installing the Python packages that serve as dependencies.

$ pip install pyaudio

For Python versions below 3.4, the Python bindings also require an installation of enum34:

$ pip install enum34

Quick-Demo

If you have chosen to use my bare-bones extract and installed the dependencies, clone this repository. A Python script that runs an instance of PORCUPiNE, listening for the word “blackberry”, can be started by changing into the repository folder and running:

$ python python_binding.py

The shell listens and reports back when the word “blackberry” is spoken, just like in the gif above. We will discuss how to run your own instances, and how you can customise the behaviour, very soon.

Full-Clone

If you cloned the full repository, you can run the demo command provided by PORCUPiNE:

$ python demo/python/porcupine_demo.py --keyword_file_paths resources/keyword_files/blueberry_${SYSTEM}.ppn

where ${SYSTEM} is replaced by the type of operating system you use (e.g. windows or mac). Saying the word “blueberry” will trigger the detection system.

How can you build yours?

PORCUPiNE instances take some parameters that allow access to key resources. These parameters are described in the parts of the PORCUPiNE documentation that deal with integration.

The library_path parameter points to the C library provided by PORCUPiNE, which can be found in the lib/${system}/${machine} folder of the original PORCUPiNE repository. ${system} should be substituted with your type of operating system (e.g. windows or mac), and ${machine} refers to your system architecture. The library_path parameter expects a path to that file, so you can copy the file to wherever you need it, as long as you point to it correctly.

The model_file_path parameter contains the path to the model parameters. This can be found in lib/common/porcupine_params.pv of the repository.

The keyword_file_paths parameter points to the files that hold your keywords.
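To make these three path parameters concrete, here is a small helper that assembles them, assuming the original repository’s lib/ and resources/keyword_files/ layout. The library filename and the extension-per-system mapping below are my assumptions for illustration; check the actual filenames in your clone of the repository.

```python
import os

def porcupine_paths(repo_root, system, machine, keyword):
    """Assemble the three path parameters PORCUPiNE needs, assuming the
    original repository layout. Filenames here are illustrative guesses."""
    # Assumed shared-library extension per operating system:
    ext = {"mac": ".dylib", "linux": ".so", "windows": ".dll"}[system]
    return {
        "library_path": os.path.join(repo_root, "lib", system, machine,
                                     "libpv_porcupine" + ext),
        "model_file_path": os.path.join(repo_root, "lib", "common",
                                        "porcupine_params.pv"),
        "keyword_file_paths": [os.path.join(repo_root, "resources",
                                            "keyword_files",
                                            f"{keyword}_{system}.ppn")],
    }
```

For example, porcupine_paths("porcupine", "mac", "x86_64", "blackberry") produces paths under porcupine/lib/mac/x86_64/ and porcupine/resources/keyword_files/, matching the folder structure described above.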

The sensitivity parameter determines the miss rate. There is a trade-off in this kind of problem between false alarms and misses. In machine learning, for example, you want fewer misses when predicting whether a patient’s cancer might be malignant, whereas when you are recommending Netflix shows, you might not really care about the occasional bad recommendation. Adjust the sensitivity parameter to whichever side your implementation requires: higher sensitivity means fewer misses but more false alarms.
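A toy illustration of this trade-off (this is not PORCUPiNE’s algorithm, just the general idea of thresholding a detection score): raising sensitivity is like lowering the score threshold, which catches more real keywords but also lets more background speech through.

```python
# Toy detector: flag any frame whose score clears the threshold.
# Higher "sensitivity" corresponds to a lower threshold.
def detections(scores, threshold):
    return [s >= threshold for s in scores]

keyword_scores    = [0.9, 0.7, 0.55]  # frames that really contain the keyword
background_scores = [0.6, 0.3, 0.1]   # frames of ordinary speech

# Low sensitivity (high threshold): no false alarms, but one miss.
print(detections(keyword_scores, 0.65))     # [True, True, False]
print(detections(background_scores, 0.65))  # [False, False, False]

# High sensitivity (low threshold): no misses, but one false alarm.
print(detections(keyword_scores, 0.5))      # [True, True, True]
print(detections(background_scores, 0.5))   # [True, False, False]
```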

There are other optional parameters that can be found in the docs.

When an instance is instantiated, you can add your own custom code to do whatever you want when the keyword is detected. Start a command, launch missiles, topple relatively stable governments so you can install your own dummy replacements and control the country’s natural resources, anything you want.

In the slightly modified code found in the extract’s script file,

handle = porcupine.Porcupine(library_path,
                             model_file_path,
                             keyword_file_paths=keyword_file_paths,
                             sensitivities=sensitivities)

creates a new PORCUPiNE instance, providing the arguments specified in the documentation.

while True:
    pcm = audio_stream.read(handle.frame_length)
    pcm = struct.unpack_from("h" * handle.frame_length, pcm)
    result = handle.process(pcm)
    if num_keywords == 1 and result:
        print('[%s] detected keyword' % str(datetime.now()))

The code above prints to the shell when a word has been detected.
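The unpack step is worth seeing in isolation: the stream hands back raw bytes, and struct.unpack_from converts them into the tuple of 16-bit integers that handle.process() expects. Here is the same conversion on synthetic bytes (no microphone needed):

```python
import struct

# Simulate one tiny "frame" of raw bytes, as audio_stream.read() would
# return it. Real frames are handle.frame_length samples long.
frame_length = 4
samples = (0, 1000, -1000, 32767)                # 16-bit signed values
raw = struct.pack("h" * frame_length, *samples)  # pack into raw int16 bytes

# The demo's unpack step recovers the integer samples from the bytes:
pcm = struct.unpack_from("h" * frame_length, raw)
print(pcm)  # (0, 1000, -1000, 32767)
```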

The steps I took (yours could be different) to create my instance of PORCUPiNE were:

  • create a new folder for the project
  • activate virtual environment for python and install dependencies
  • move the files needed by the parameters into that folder
  • write a script that instantiates PORCUPiNE and prints to the shell when the word is detected.
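The steps above can be sketched as one function. This is a hedged outline, not the exact script from the extract: the paths passed to the constructor are placeholders you must replace with the files you copied into your project folder, and it assumes pyaudio and the porcupine Python binding are importable.

```python
def run_wake_word_demo():
    """Sketch of the full flow: open a mic stream sized to the engine's
    frame length, feed frames in a loop, clean up on exit.
    Paths below are placeholders, not real files."""
    import struct
    from datetime import datetime
    import pyaudio
    import porcupine  # the Python binding shipped with the repository

    handle = porcupine.Porcupine(
        "lib/mac/x86_64/libpv_porcupine.dylib",  # placeholder library_path
        "lib/common/porcupine_params.pv",        # placeholder model_file_path
        keyword_file_paths=["blackberry_mac.ppn"],
        sensitivities=[0.5],
    )
    pa = pyaudio.PyAudio()
    audio_stream = pa.open(rate=handle.sample_rate, channels=1,
                           format=pyaudio.paInt16, input=True,
                           frames_per_buffer=handle.frame_length)
    try:
        while True:
            pcm = audio_stream.read(handle.frame_length)
            pcm = struct.unpack_from("h" * handle.frame_length, pcm)
            if handle.process(pcm):
                print('[%s] detected keyword' % str(datetime.now()))
    finally:
        audio_stream.close()
        pa.terminate()
        handle.delete()
```

Note the try/finally: releasing the stream and calling handle.delete() frees the microphone and the engine’s native resources even if you interrupt the loop with Ctrl-C.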

PORCUPiNE also allows you to create your own custom keywords using their optimizer, as well as use multiple keywords concurrently. Dig into their docs for all the things you can do.

Thanks for reading; I would love to see what you build from this tutorial.

Do you need to hire top developers? Talk to Andela to help you with that.
