Controlling a Raspberry Pi with a phone

Allison King
Published in Cortico
9 min read · Jun 14, 2019

Cortico launched its Local Voices Network earlier this year, an initiative designed to bring under-heard community voices, perspectives and stories to the center of a healthier public dialogue. Members of a community come together and participate in small group discussions gathered around a “digital hearth,” a specialized device that records the discussion and also has a speaker that enables hosts to play clips from previous conversations to prompt discussion and cross-pollinate diverse perspectives across the community.

The digital hearth

When we first set out to build the digital hearth, we only had a few requirements:

  • Record a conversation on an 8 mic array
  • Play back audio clips
  • Some sort of indicator for recording state
  • Easy for the average person to control

Most of this can be done with a tablet or a phone, but the 8 mic array allows us to get information on who is speaking at different times. This sort of information is very useful for our downstream conversation explorer.

We ended up deciding to use a Raspberry Pi to interface with an 8 mic array since it gave us a lot of flexibility. We could also hook it up to an LED ring in order to indicate recording state, as well as a speaker to play back audio. The only requirement left was that it be easy for someone, such as a conversation host, to control it (start recording, play back audio, pause, stop recording, discard recording, etc.).

At this point, we had not written much software yet. We did not know if the easiest thing would be to attach a touch screen to the Pi and build out an app that would run on Raspbian (the default Raspberry Pi operating system), or to build a web application that would require pulling up a browser on the Pi, or figure out another way entirely to interface with the Pi.

We tried out a touch screen built for Raspberry Pis, but its responsiveness was lacking, especially around the edges, and its on-screen keyboard was very difficult to use. Compared to the touch responsiveness of a smartphone, this touch screen was unacceptable. So we began to wonder: how can we control this Raspberry Pi, buried in its wood encasing, through a phone?

Architecture

At Cortico, we have a good amount of experience working on web applications. A web app runs anywhere that has a browser, so we figured that if we started developing a web app, we would keep our options open as we experimented with the limits of the hardware. Even if the hearth were offline, we figured we could cache the app and it would still work.

We played around with the idea of the Pi recording and streaming its audio to the phone, then allowing the phone to upload the audio file with its WiFi. This turned out to be intractable since our audio files, with 8 channels of audio data over the course of about an hour long conversation, ended up being around 2GB each.
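As a rough back-of-the-envelope check on that file size, uncompressed PCM audio grows as channels × sample rate × bytes per sample × duration. The sample rates and the 16-bit sample width below are assumptions for illustration only; the actual recording settings are not stated here.

```python
def pcm_payload_bytes(channels, sample_rate_hz, bytes_per_sample, seconds):
    """Size of raw, uncompressed PCM audio (ignores the small file header)."""
    return channels * sample_rate_hz * bytes_per_sample * seconds


# Assumed settings: 8 channels, 16-bit (2-byte) samples, a one-hour conversation.
ONE_HOUR = 60 * 60
for rate in (16_000, 32_000, 44_100):
    size = pcm_payload_bytes(8, rate, 2, ONE_HOUR)
    print(f"{rate} Hz: {size / 1e9:.2f} GB")
```

At any of these common sample rates, an hour of 8-channel audio lands in the gigabyte range, which is far more than is comfortable to stream to a phone and re-upload from it.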

In the end, we decided that the Pi would be responsible for everything except the user interface. On the phone, we could use a browser to access our website that can then begin talking to the Pi. The website would be nothing more than a reflection of the state on the Pi.

Pi and Phone responsibilities

While this answered the question of what the Pi would do and what the phone would do, we still had not solved the question of how they would communicate with each other.

Networking

Our first thought was to use the Pi’s bluetooth capability to connect to the phone. The Raspberry Pi 3 comes with both bluetooth and an on board WiFi chip. In this scheme, the Pi would act as the web app server. We initially set up a bluetooth Personal Area Network (PAN), set up a backend to run on the Pi, then tested to see if we could connect to the Pi via bluetooth and hit our endpoint via a mobile browser.

We wrote a simple Flask app on the Pi with a few endpoints. A Raspberry Pi running Raspbian advertises its hostname over mDNS as <hostname>.local (raspberrypi.local on a stock image; pi.local in the examples here), so if you are on the same network as the Pi (even a bluetooth PAN), you can access it via that name instead of its IP address. Flask's development server listens on port 5000 by default, so to hit an endpoint from a phone connected on the same network, you would navigate to http://pi.local:5000/api/...:

Phone to Pi communication
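To make the idea of a small JSON endpoint on the Pi concrete, here is a minimal sketch that uses only Python's standard-library WSGI support as a stand-in for Flask (in Flask, the same handler would be a function decorated with `@app.route`). The `/api/status` path and the `recording` field are hypothetical names chosen for illustration; they are not the hearth's real API.

```python
import json
from wsgiref.simple_server import make_server

# Hypothetical in-memory state; the real fields are not listed in this article.
STATE = {"recording": False}


def app(environ, start_response):
    """Tiny WSGI app: GET /api/status returns the current state as JSON."""
    is_status = (
        environ.get("REQUEST_METHOD") == "GET"
        and environ.get("PATH_INFO") == "/api/status"
    )
    if is_status:
        status, payload = "200 OK", STATE
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve(port=5000):
    """Listen on all interfaces so a phone on the same network can connect."""
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()
```

Calling `serve()` on the Pi and opening http://pi.local:5000/api/status from a phone on the same network should return the JSON state, assuming the hostname resolves on that phone.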

Hitting the endpoint worked great! But there were a few things about the bluetooth connection that didn’t work great:

  1. It was fast enough to send back JSON data, but noticeably slow for loading frontend data (e.g. endpoint / hosted our front end JavaScript code). Eventually, when we began streaming volume data to the phone via websockets, this was also noticeably slow.
  2. Connecting to bluetooth was sometimes troublesome: the Pi would have to go into pairing mode and then the phone would have to search for it. If the connection was lost, we would need to add another physical interface to the Pi (such as a button) to put it back into pairing mode.

We decided to move away from bluetooth and to instead use a WiFi access point. Pis are frequently configured as WiFi access points, for example to set up a guest WiFi network in your house. We followed some documentation on the Raspberry Pi site to turn the Pi's onboard WiFi chip into an access point.

This did mean that our Pi no longer had an outgoing WiFi connection. But this was easily solved by buying a USB WiFi dongle which would act as its outgoing connection.

Phone to Pi communication via WiFi access point

In this architecture, the Pi is serving up our website and broadcasts a WiFi connection called hearthnet which is visible on any device as a WiFi option.

List of WiFi options from a PC when multiple hearths are online

The WiFi access point gave us a few advantages over bluetooth:

  • Easy and intuitive to connect to
  • Password protected WiFi connection
  • Ability to forward the Pi’s WiFi connection to the phone

This last ability in particular also gave us another great unexpected benefit: the ability to connect the Pi to WiFi networks that put a splash screen in front of internet access. Because our hearths are deployed to public libraries, we worried that many of these might have splash screens that need to be agreed to before the internet is reachable. There are a lot of articles online about how to get through these on a Raspberry Pi, but all of the ones we found required opening a browser on the Pi and figuring out how to click the ‘accept’ button. With this method, instead, the splash screen is forwarded to the phone, which can then accept on behalf of the Pi 🎉

Building the App

In the end, we had the following running on the Raspberry Pi:

  • Flask backend running via gunicorn that starts up on boot through systemd
  • React frontend hosted by nginx which is sent to the phone when it connects to the Pi
  • Various Python cron jobs for uploading and downloading audio, for updating its status to our servers, and for downloading and running updates on itself

This architecture worked well for phone and Pi communication, and because it is just a web app, it was also easy for developers to work on their own machines, with most things working just fine once deployed to the Pi. There were a few things, though, that could only be tested on the Pi/phone setup, such as:

  • Setting GPIO pins (to control LEDs, for example)
  • Setting WiFi (though some of this may have worked if the development machine were a Linux machine)
  • Some details on how the front end looked on an actual phone

Backend

We developed a Flask app with REST endpoints such as:

  • Record (start, stop, pause)
  • Play (start, stop)
  • Set WiFi (to connect the Pi to WiFi)
  • Setting volume

We also had some websocket connections:

  • Streaming the current volume
  • State (recording state, playback state, name of host, etc.)

Unlike many web applications, our backend had to keep its state in memory since it was writing out audio files. Because of this, we could not ask gunicorn to spin up multiple workers: each worker is a separate process with its own copy of the state, so requests handled by different workers would see inconsistent state. For the most part, this was not a problem, since there would usually be only one client connected to each backend. Even if multiple phones connected at once, they would show the same screen since we push our state through websockets.
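The single-process, push-based design described above can be sketched as one state object that sends every change to all connected clients. This is an illustration only: the class name, the field names, and the use of plain callables in place of real websocket connections are all assumptions.

```python
import json
import threading


class HearthState:
    """Single source of truth that pushes every change to all subscribers."""

    def __init__(self):
        self._lock = threading.Lock()
        # Hypothetical fields, loosely based on the state listed above.
        self._state = {"recording": "idle", "playback": "stopped", "host": None}
        self._subscribers = []

    def subscribe(self, send):
        """Register a client; `send` stands in for a websocket's send method."""
        with self._lock:
            self._subscribers.append(send)
            snapshot = json.dumps(self._state)
        send(snapshot)  # a new client immediately receives the current state

    def update(self, **changes):
        """Apply changes, then broadcast the full state to every client."""
        with self._lock:
            self._state.update(changes)
            message = json.dumps(self._state)
            subscribers = list(self._subscribers)
        for send in subscribers:
            send(message)
```

Because every phone receives the same full snapshot after each change, two phones connected at once end up rendering the same screen. The approach relies on exactly one process owning the `HearthState` instance.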

Frontend

We originally started in Preact, a smaller alternative to React, but once we moved away from bluetooth, we were no longer constrained by bundle size, so we switched back to React.

The frontend gets all of its state from the backend via websockets. It can set the backend to go into different states, such as recording, playing an audio file, or setting the Pi’s WiFi. In this screenshot, the volume data, as well as the time elapsed, is being streamed from the Pi over to the phone via websockets so that the host is confident that their conversation is being recorded. We also set the endpoint that starts recording to have the Pi change the color of the LED ring to orange (instead of its default green).

From this screen, a host can also play ‘highlights’ from other conversations in order to bring other voices into a conversation.

We did hit a rather large inconvenience when we wanted to grab the phone’s geolocation in order to know roughly where a conversation took place. While this can be done pretty easily in most browsers, modern browsers only expose geolocation to pages served from a secure origin. Our Pi to phone connection is not over HTTPS, so our app was not allowed to access geolocation data. Instead of rolling out certs to all of our Pis and maintaining them, we chose to build a native app with Cordova that is just a webview. This allows us to access data such as geolocation. Since it is just a webview, we don’t really have to worry about app store updates: we can just roll out a new build of our frontend to the Pi and the phone will grab this new version the next time it connects. The phone app being a webview also avoids the case where we update the backend on the Pi but the phone app has not been updated yet, resulting in possibly incompatible interactions between frontend and backend. Instead, the two are always in sync 👍

Conclusion

Controlling a Raspberry Pi through a phone via a web application proved to work out pretty well. The user interface of a phone is familiar to most of our hosts and made development similar to what we were already used to doing. Furthermore, the possibilities for what the phone can control are anything a Raspberry Pi can do, from recording and playing audio to controlling lights and motors.

That said, there is a bit of configuration needed to get the networking right between the WiFi access point and a WiFi dongle, as well as to set up the backend and frontend. So we made a repo with an Ansible script that sets up a basic app with this architecture for you! Enjoy, and we hope with all of our hearths that this might help you make something awesome!
