Raspberry Pi Smart Speaker and Roll-Your-Own Services

Syed R Ali
Published in Geek Culture
5 min read · Jul 18, 2019
Best smart speakers
Image by Steven Stone from AudiophileReview

After a read and a think, I have pretty much decided that my next home technology project will be a Raspberry Pi-powered smart speaker. I'm doing this for a few reasons. Where possible, I like to roll my own services at the application level. For example, I run my own private media streaming service using a virtual private server, cloud-based storage, and Plex, as opposed to just using Netflix. I run my own multi-mode (i.e. the same content accessible via a TV as well as on my smartphone) console-replacement PC/retro gaming service using different emulators, the LaunchBox front-end, and again cloud storage, rather than simply getting a Nintendo Switch. And I run my own eBook service using similar cloud storage, the Ubooquity book server, and client apps across different devices, rather than the Amazon Kindle ecosystem.

Doing so generally involves some Linux command-line scripting for automation or configuration, plus knowledge of which applications and existing services to use. None of this is difficult or very time-consuming for anyone remotely technical, but it isn't something you can ask an average consumer to do.

Cloud computing
Image by vschlichting from Depositphotos

I do this mainly because of my background as a former technologist: it lets me explore and play with fun new technologies! Also, critically, I don't like having to choose between competing services, especially when media and content are exclusive to different services. In my opinion, it's bad for the consumer to have to switch between services simply because they want to watch a different TV show.

When exclusive content is tied to a specific application, it defeats much of the point of information technology and computing, which is meant to decouple data from its medium or source. Finally, while upfront costs such as time or equipment can be higher than simply subscribing to or buying existing services and devices, over the medium to long term the costs work out far cheaper than paying for multiple competing services or owning multiple different devices.
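To make the cost argument concrete, here is a minimal break-even sketch. The function name and every figure in it are hypothetical placeholders chosen for illustration, not my actual costs; plug in your own numbers.

```python
import math


def months_to_break_even(upfront_cost, diy_monthly, subscriptions_monthly):
    """Return the first whole month by which the roll-your-own setup has
    cost no more in total than the subscriptions it replaces.

    Returns None if the DIY running cost is not lower than the
    subscriptions, in which case the upfront cost is never recovered.
    """
    monthly_saving = subscriptions_monthly - diy_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Hypothetical figures: 120 upfront on equipment, 5 a month to run,
# replacing 25 a month of combined subscriptions.
print(months_to_break_even(120, 5, 25))
```

With these made-up numbers the saving is 20 a month, so the 120 spent upfront is recovered after six months; everything after that is where the "far cheaper" part comes from.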

Voice assistant evolution
Image by Shreyas Sali from Medium

Given my existing home-rolled services, the next obvious place to wire together my own service from existing applications is a smart speaker. The requirement is to access Amazon Alexa, Google Assistant, and the open-source AI voice assistant Mycroft from the same device, with possible extension in the future to other voice assistants.

It would need to play music or other recorded media, such as audiobooks, from different streaming sources such as Spotify, Google Play Music, or my own media, without switching applications. Ideally, it could also run commands on the controlling device or even interact with Home Assistant/Hass.io for any future smart home automation.
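One way to picture the "several assistants on one device, with room for more later" requirement is a small registry that routes a request to whichever assistant was addressed. This is only a sketch of the idea: the `AssistantRouter` class, the wake phrases, and the stub handlers are all hypothetical, it works on already-transcribed text rather than audio, and it does not call any real Alexa, Google Assistant, or Mycroft API.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], str]


class AssistantRouter:
    """Route a transcribed utterance to an assistant by its wake phrase."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, wake_phrase: str, handler: Handler) -> None:
        # Adding another assistant later is just one more register() call.
        self._handlers[wake_phrase.lower()] = handler

    def route(self, utterance: str) -> Optional[str]:
        text = utterance.strip().lower()
        # Try longer wake phrases first so a short one cannot shadow them.
        for phrase in sorted(self._handlers, key=len, reverse=True):
            if text.startswith(phrase):
                request = text[len(phrase):].strip(" ,")
                return self._handlers[phrase](request)
        return None  # no assistant was addressed


def make_stub(name: str) -> Handler:
    # Placeholder standing in for a real assistant client.
    return lambda request: f"{name} <- {request}"


router = AssistantRouter()
router.register("alexa", make_stub("Alexa"))
router.register("hey google", make_stub("Google Assistant"))
router.register("hey mycroft", make_stub("Mycroft"))

print(router.route("Hey Mycroft, play the news"))
```

The point of the registry is the extension requirement: the routing logic never changes when a new assistant is added, only the list of registered handlers does.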

Home automation to conquer the world
Image by AtReef from YouTube

While implementations of all these applications are available on almost every platform, several requirements mean it makes sense to use a new device. The smart speaker always needs to be on and available for use. It will live in my bedroom, where I would use it the most, typically for news at the start and end of the day or for background music, so it must not interrupt my sleep.

My existing gaming PC/workstation has a powerful fan and a hybrid solid-state/mechanical hard drive, both of which make noise, and its high power requirements mean it's not a suitable platform. My existing HTPC/console replacement PC is almost silent and has no fan issues, but it relies on a standard mechanical hard drive, and that noise makes it unsuitable too.

My tablet and smartphone aren't appropriate because their short battery life means they can be unavailable at times. My media server is a virtual machine hosted in a data centre, with no direct, physically accessible interface.

Image by Markku Rossi from SSH.COM

This means a new device is needed to run the applications and access the online services that will drive the smart speaker. The new device needs to be fanless, with only solid-state storage and no mechanical parts for minimal noise, and it needs good connectivity over Wi-Fi and Bluetooth. This leaves a couple of hardware options.

The first is a fanless PC with an SSD. Several are available, mostly based around the Intel NUC or the existing mini PC paradigm, intended for media PCs or industrial uses such as digital signage, presentations, or anywhere silent running is critical. The second is the Raspberry Pi, a cheap, commonly available, well-supported single-board computer typically used in small technology projects that require a low-powered digital controller.
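The elimination process above can be written down as a simple filter. This is a sketch of the reasoning rather than a measurement of anything: the `Device` fields are my shorthand for the requirements, and several flags are assumptions (for example, that the existing PCs have Wi-Fi and Bluetooth, and that the Raspberry Pi is a model with both on board and runs fanless from solid-state storage).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    fanless: bool
    solid_state_only: bool
    always_available: bool
    wifi_and_bluetooth: bool  # assumed True for the existing PCs


def meets_requirements(device: Device) -> bool:
    """A bedroom smart speaker host must be silent, always on, and connected."""
    return (
        device.fanless
        and device.solid_state_only
        and device.always_available
        and device.wifi_and_bluetooth
    )


CANDIDATES = [
    Device("Gaming PC/workstation", False, False, True, True),
    Device("HTPC/console replacement PC", True, False, True, True),
    Device("Tablet or smartphone", True, True, False, True),
    Device("Fanless mini PC with SSD", True, True, True, True),
    Device("Raspberry Pi", True, True, True, True),
]

suitable = [d.name for d in CANDIDATES if meets_requirements(d)]
print(suitable)
```

Only the two new-hardware options pass every check, which is why the choice between them comes down to software support and price.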

Image by Quiet PC from Amazon

The host OS needs good software support, as many different individual applications will be needed. It needs to be open and extensible, as I have no idea what problems I may encounter that need extra work beyond simple installation and configuration.

It also needs to be easily scriptable, for wiring up automation between the different applications and components that make up the complete system. This leaves Linux as the obvious choice, with a Debian-based distribution because of its extensive support and package management system.

Raspberry Pi boxed
Image by Nico Kaiser from Flickr

After looking at prices, the total components needed, and the vendors, the obvious choice is a Raspberry Pi running the Raspbian OS. This fits all the requirements above while still being slightly cheaper than its nearest alternative.

The other necessary hardware is a Bluetooth speaker with a built-in microphone, to function as the smart speaker itself, plus the user interface hardware needed to get the device powering the speaker initially configured, such as a keyboard, mouse, and display. These would go unused once the smart speaker is working smoothly, at which point the speaker becomes the primary interface to its own controlling device.

Syed R Ali
Geek Culture

Londoner, desi, financial technologist, geek, weight training & combat sports junkie.