Overview of how to set up and run PocketSphinx for offline voice recognition on your Qualcomm Dragonboard 410c
Disclaimer: You don’t need a 3.5mm connection for the microphone; any analog mic will work. This article uses one because it is aimed at quick, off-the-shelf prototyping.
- Privacy concerns.
- Poor/non-reliable/no internet connection.
- No charges after a certain number of uses.
- Don’t want to spend time packaging your data into JSON and constructing an HTTP request (mostly a concern for programs written in C).
What you will need
The Dragonboard 410c, a power adapter, and an SD card for booting Linux are the standard requirements.
The microphone is a personal decision, but I have had great success with the earbuds that came with my phone.
For recording via the 3.5mm jack, your two options are wiring up your own 3.5mm barrel jack connector or buying the audio mezzanine board. I chose the mezzanine board because it also offers level shifters to 3.3V and 5.0V for the GPIO pins and a serial UART over USB, which is very useful for the low price.
The Dragonboard and Linux
I am going to assume you have already set up Linaro’s Debian Linux on your Dragonboard, as there is already great documentation on how to do that.
Update and add Linux packages
First, make sure your packages are up to date and install the build dependencies:
~$ sudo apt update
~$ sudo apt upgrade
~$ sudo apt install autoconf libtool automake bison python-dev swig libasound2-dev
The main package to note is libasound2-dev, as it provides the ALSA library headers we need.
Linux Audio Subsystems
“The installation process is not an issue if you understand the complexity of audio subsystems in Linux.” ~ PocketSphinx FAQ
If you are not familiar with how audio works in Linux, take a deep breath and bear with me for this section, as it will be useful in later steps.
ALSA vs PulseAudio
For the sake of time, here is what you need to know to get started; for anything more, I encourage you to do your own research.
- ALSA - Low level drivers for fine control of audio options.
- PulseAudio - Wrapper that uses ALSA but takes care of connecting multiple sources of audio I/O
There is nothing wrong with using PulseAudio, but this tutorial uses ALSA since the Dragonboard 410c has few audio inputs and outputs by default.
Understanding ALSA settings
You will see options such as -c 0 in later commands, and I want to quickly explain where these settings come from. Running cat /proc/asound/cards lists the sound cards on the board.
The Dragonboard has only one audio card, so every command directed at it, such as amixer, will use -c 0. Another file to check is /proc/asound/pcm: running cat /proc/asound/pcm lists the playback and capture devices on that card.
plughw:0,1 refers to the output speakers on your board.
plughw:0,2 refers to the input microphone we will be using.
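To make the link between /proc/asound/pcm and the plughw:card,device names concrete, here is a minimal sketch of a helper that converts entries in that file's "CC-DD: ..." format into device names. The function name pcm_to_plughw and the sample lines in the comments are made up for illustration and are not actual output from the board.

```shell
# Convert lines in the /proc/asound/pcm format
#   "CC-DD: id : name : playback N : capture N"
# into ALSA device names of the form plughw:C,D plus the direction.
# pcm_to_plughw is a hypothetical helper name, not part of ALSA.
pcm_to_plughw() {
    awk '/^[0-9]+-[0-9]+:/ {
        split($1, id, "-")          # $1 looks like "00-02:"
        dir = ""
        if ($0 ~ /playback/) dir = "playback"
        if ($0 ~ /capture/)  dir = (dir == "" ? "capture" : dir "+capture")
        # %d drops the leading zeros and the trailing colon
        printf "plughw:%d,%d %s\n", id[1], id[2], dir
    }'
}

# Made-up sample input (not real Dragonboard output):
#   00-01: Example out : : playback 1
#   00-02: Example in : : capture 1
# On the board, feed it the real file instead:
#   pcm_to_plughw < /proc/asound/pcm
```

The first number is the card index (the same value passed to amixer with -c) and the second is the device index on that card.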
PocketSphinx is a great library designed for embedded devices such as the Dragonboard. The first step is to clone the two repos somewhere on your device.
~$ git clone https://github.com/cmusphinx/sphinxbase.git
~$ git clone https://github.com/cmusphinx/pocketsphinx.git
First we will set up Sphinxbase.
~$ cd sphinxbase
~$ ./autogen.sh
The main thing to check here is that it was configured to use ALSA, which can be seen in the configure log.
Note that Sphinxbase looks for the PulseAudio library headers first (having PulseAudio installed and having its development libraries installed are two different things). If it detected PulseAudio, we will want to reconfigure it.
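Before configuring, it can help to check which development headers are present. The sketch below mirrors the "PulseAudio first, then ALSA" order described above; the function name audio_backend_guess is hypothetical, and the header locations assume the usual Debian layout under /usr/include. It only predicts the outcome and does not replace reading the configure log.

```shell
# Guess which audio backend a "PulseAudio first, then ALSA" header check
# would pick. $1 is the include directory to inspect; it defaults to
# /usr/include, which is an assumption about where Debian puts the headers.
audio_backend_guess() {
    inc=${1:-/usr/include}
    if [ -f "$inc/pulse/pulseaudio.h" ]; then
        echo pulseaudio     # libpulse-dev headers are present
    elif [ -f "$inc/alsa/asoundlib.h" ]; then
        echo alsa           # only the libasound2-dev headers are present
    else
        echo none           # neither set of headers was found
    fi
}

audio_backend_guess
```

If this prints pulseaudio, the PulseAudio headers are installed and would be detected first.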
How to remove PulseAudio headers ( if needed )
There are two options for this: the first is to simply remove them with sudo apt purge libpulse-dev, and the other is to remove the AC_CHECK_HEADER(pulse/pulseaudio.h section from the configure.ac file and run ./autogen.sh again.
Finish installing Sphinxbase
To finish Sphinxbase we just need to make and install it.
~$ make
~$ sudo make install
This is the easy part.
~$ cd ../pocketsphinx
~$ ./autogen.sh
~$ make
~$ sudo make install
Add PocketSphinx path
You can either add these two commands to your .bashrc or just run them each session.
~$ export LD_LIBRARY_PATH=/usr/local/lib
~$ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
NOTE: If you see
error while loading shared libraries: libpocketsphinx.so.3 then you forgot this step.
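If you go the .bashrc route, one possible way to avoid duplicate entries when the setup is run more than once is to append each line only if it is missing. This is a small sketch; add_line_once is a hypothetical helper name, not a standard command.

```shell
# Append a line to a file only if that exact line is not already there.
# add_line_once is a hypothetical helper for illustration.
add_line_once() {
    file=$1
    line=$2
    touch "$file"                                   # create the file if needed
    # -x: match whole lines, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (run once; repeated runs change nothing):
#   add_line_once "$HOME/.bashrc" 'export LD_LIBRARY_PATH=/usr/local/lib'
#   add_line_once "$HOME/.bashrc" 'export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig'
```

Open a new terminal or source the file afterwards so the variables take effect.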
Configure your microphone
The microphone is configured through the codec inside the Dragonboard 410c, and there are only three commands you need to run.
~$ amixer -c 0 cset iface=MIXER,name='DEC1 MUX' 'ADC2'
~$ amixer -c 0 cset iface=MIXER,name='ADC2 MUX' 'INP2'
~$ amixer -c 0 cset iface=MIXER,name='ADC2 Volume' 8
'DEC1 MUX' 'ADC2' tells the board to use ADC2 (ADC = analog-to-digital converter).
'ADC2 MUX' 'INP2' tells the board to use MIC3_IN, which is what the 3.5mm jack is hooked up to.
'ADC2 Volume' 8 sets the volume, which goes from 0 to 8, with 0 being 0dB and each step increasing the gain by 6dB.
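The volume-to-gain relationship described above is simple arithmetic, sketched here as a small helper. The function name adc_gain_db is hypothetical, and the 6dB-per-step rule is taken from the description above.

```shell
# Convert an 'ADC2 Volume' step (0 to 8) into gain in dB at 6 dB per step.
# adc_gain_db is a hypothetical helper for illustration.
adc_gain_db() {
    step=$1
    case $step in
        [0-8]) echo $((step * 6)) ;;
        *) echo "step must be an integer from 0 to 8" >&2; return 1 ;;
    esac
}

adc_gain_db 8   # prints 48
adc_gain_db 4   # prints 24
```

If recordings clip or sound distorted, a lower step in the third amixer command is worth trying.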
To test your microphone, you can use arecord -D plughw:0,2 -d 5 -r 48000 -f S16_LE test.wav to record your voice into a .wav file for 5 seconds.
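As a rough sanity check on the recording, the amount of audio data can be estimated from the arecord options: seconds times sample rate times bytes per sample times channels. The sketch below assumes a single channel, since no channel count is given in the command; pcm_bytes is a hypothetical helper name.

```shell
# Bytes of raw audio = seconds * sample rate * bytes per sample * channels.
# pcm_bytes is a hypothetical helper for illustration.
pcm_bytes() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# 5 s at 48000 Hz, S16_LE (2 bytes per sample), assuming one channel
pcm_bytes 5 48000 2 1   # prints 480000
```

Comparing this with wc -c < test.wav should give a file slightly larger than the estimate, because the .wav file also carries a small header. A file far smaller than this suggests the recording stopped early.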
You can also test
Create a PocketSphinx Dictionary Set
You can now test PocketSphinx with their demo using pocketsphinx_continuous -adcdev plughw:0,2 -inmic yes, and you will find that it is awful and has very low accuracy. I am not sure why this is the case by default, but the way to get very high accuracy is to create your own dictionary of words that can be recognized. PocketSphinx uses two files for its recognition: a .dic dictionary file and a .lm QuickLM language model file. I recommend creating a blank .txt file and writing on each line a single sentence or word you want added to your word bank. From there, use the Sphinx LM Online Tool to convert that file into a .lm file (and the matching .dic file) for you.
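A minimal sketch of preparing such a word-bank file from the shell is shown below. The phrases, the file name, and the helper names make_corpus and corpus_words are all made up for illustration. The second helper simply lists the unique words so you can see what vocabulary the recognizer will be limited to.

```shell
# Write a small corpus with one phrase per line.
# The phrases are made-up examples; replace them with your own commands.
make_corpus() {
    cat > "$1" <<'EOF'
turn on the light
turn off the light
hello dragonboard
EOF
}

# List the unique words in a corpus, upper-cased, one per line.
corpus_words() {
    tr -s ' \t' '\n\n' < "$1" | tr '[:lower:]' '[:upper:]' | sed '/^$/d' | sort -u
}

# Usage:
#   make_corpus corpus.txt
#   corpus_words corpus.txt
```

Keeping the corpus small and specific to your application is what makes the recognizer accurate, since it only has to choose between the phrases you listed.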
There are various ways to run PocketSphinx, and I have several demos showing how you can incorporate it into your C/C++ program.
If you have set up all the settings above you can run this demo with no prep.
~$ wget https://raw.githubusercontent.com/sjfricke/Dragonboard-Voice-Recognition/master/Linux-ALSA-PocketSphinx/demo.dic
~$ wget https://raw.githubusercontent.com/sjfricke/Dragonboard-Voice-Recognition/master/Linux-ALSA-PocketSphinx/demo.lm
~$ pocketsphinx_continuous -adcdev plughw:0,2 -inmic yes -dict demo.dic -lm demo.lm
Once it is running, say any of the words listed in demo.dic and witness your new voice-recognition-enabled Dragonboard.
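For quick experiments without writing any C, one possible approach is to pipe the recognizer's output into a small shell loop that reacts to keywords. The sketch below assumes that pocketsphinx_continuous prints each recognized phrase on its own line to standard output and that -logfn /dev/null keeps its log messages out of the way. The keywords, the actions, and the handle_phrases name are placeholders and are not taken from the demo word list.

```shell
# React to recognized phrases read from standard input, one per line.
# handle_phrases, the keywords, and the actions are placeholders.
handle_phrases() {
    while IFS= read -r phrase; do
        # Lower-case the phrase so matching ignores case
        lower=$(printf '%s' "$phrase" | tr '[:upper:]' '[:lower:]')
        case $lower in
            *hello*) echo "greeting heard" ;;
            *stop*)  echo "stopping"; break ;;
            *)       : ;;   # ignore anything else
        esac
    done
}

# On the board this would be fed by the recognizer, for example:
#   pocketsphinx_continuous -adcdev plughw:0,2 -inmic yes \
#       -dict demo.dic -lm demo.lm -logfn /dev/null | handle_phrases
```

Replacing the echo lines with your own commands (toggling a GPIO pin, for instance) turns this into a very small voice-controlled script.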