Hacking AirPlay into Sonos

Earlier this year, I bought a Sonos Play:3 speaker. I was impressed with the hardware and sound quality, but the Sonos controller software left me wanting; I was especially unhappy that there was no AirPlay support built in, a feature I’d expect from expensive wireless speakers. Unsatisfied, I set out to put together a solution.

There were two pieces required to get AirPlay working on Sonos: code that could speak the AirPlay protocol and pull out the audio, and code that could convince the speaker to play that audio. You can check out the end product, AirSonos, on GitHub.

AirTunes

First off, I needed to understand how to talk AirTunes, the AirPlay audio protocol. Reverse engineering AirTunes was daunting upfront, but turned out to be manageable thanks to some unofficial documentation, alongside a lot of Wireshark dumping.

A lot of Wireshark dumping.

Wireshark let me peek at network traffic while I streamed audio from my iPhone to an AppleTV, which proved useful for figuring out what was really going on behind the scenes.

Some digging revealed that Apple based AirTunes on the existing RTSP and RTP protocols, along with Bonjour for service discovery.

Each protocol plays a different part: RTSP brokers the connection between the devices, the actual audio is streamed out via RTP, and Bonjour announces to the network that an AirPlay device is available.
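To make the Bonjour piece a little more concrete: as far as the unofficial documentation describes it, AirTunes receivers advertise themselves under the `_raop._tcp` service type, with an instance name made of the device's MAC address in hex, an @ sign, and the display name. Here is a minimal sketch of building that name. The function name and the sample values are made up for illustration, and this is not necessarily how nodetunes does it.

```javascript
// Service type that AirTunes (RAOP) receivers are advertised under.
const RAOP_SERVICE_TYPE = '_raop._tcp';

// Builds the Bonjour instance name for a receiver.
// Hypothetical helper: takes a MAC address in any common notation
// (colons, dashes, or bare hex) plus the human-readable device name.
function raopServiceName(macAddress, deviceName) {
  const hex = macAddress.replace(/[^0-9a-fA-F]/g, '').toUpperCase();
  if (hex.length !== 12) {
    throw new Error('expected a 48-bit MAC address');
  }
  return hex + '@' + deviceName;
}
```

Calling `raopServiceName('58:55:ca:1a:e2:88', 'Living Room')` gives `5855CA1AE288@Living Room`, which is the kind of name that shows up when browsing `_raop._tcp` services on a network with AirPlay devices.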

I’m pretty sure that’s a demon.

When it came time to decode the audio, I hit a roadblock. Apple encrypts the RTP stream with an AES key that changes with every connection. While I learned enough basic crypto to debug what was going on, all I got at first was an interesting mix of incomprehensible static and demon voices coming out of my laptop.

Eventually, I figured out that I could use an AirPort Express’ private key that someone else had kindly dumped to decrypt the audio, effectively masquerading as a real Apple-approved AirPlay device.
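For the curious, the per-packet decryption step can be sketched roughly as follows. This assumes the per-connection AES key and IV have already been recovered (that is the part that needs the private key), and it assumes the scheme that open-source receivers such as shairport implement: AES-128-CBC with the IV reset for every packet, and only whole 16-byte blocks encrypted. The function and parameter names are illustrative, and the code targets a current Node release rather than 0.10.

```javascript
const crypto = require('crypto');

// Decrypts the audio payload of a single packet.
// Assumptions (not confirmed by this post): AES-128-CBC, the same IV
// reused at the start of every packet, and any trailing bytes that do
// not fill a 16-byte block sent unencrypted.
function decryptAudioPacket(payload, aesKey, aesIv) {
  const encryptedLength = payload.length - (payload.length % 16);

  const decipher = crypto.createDecipheriv('aes-128-cbc', aesKey, aesIv);
  // The stream carries no PKCS#7 padding, so turn padding handling off.
  decipher.setAutoPadding(false);

  const decrypted = Buffer.concat([
    decipher.update(payload.slice(0, encryptedLength)),
    decipher.final()
  ]);

  // Copy the unencrypted tail through as-is.
  return Buffer.concat([decrypted, payload.slice(encryptedLength)]);
}
```

Getting any of these details wrong (the wrong key, a chained IV, or padding left on) is a plausible way to end up with static instead of music.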

After some tinkering, I came up with nodetunes, an implementation of the AirTunes protocol in node. Nodetunes pretends to be an AirPlay-compatible server, getting an Apple device to send audio its way. Sound goes in via iOS/OS X’s system audio, and comes out of the nodetunes end as raw 16-bit PCM data. All of the funny magic behind streaming audio gets abstracted into code that looks something like:

var Nodetunes = require('nodetunes');
var server = new Nodetunes({ serverName: 'NodeTunes Speaker' });
server.on('clientConnected', function(stream) {
  // yay audio stream
});
server.start();

Sonos

The second half of the problem was to get my shiny new audio stream actually playing on the speakers. Digging through the Sonos documentation, it seemed there wasn’t a way to get the speakers to just accept a stream of PCM data. The only method of continuous audio supported would be through internet radio stations.

Sonos’ internet streaming is backed by the Shoutcast protocol, made popular by Winamp back in 1999. Some prior art made hacking together a Shoutcast-compatible node server relatively painless. The protocol turns out to be pretty simple: the server emits a stream of MP3 audio over what is essentially HTTP, and metadata is injected every few thousand bytes.
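The metadata injection can be sketched like this. In the Shoutcast (ICY) scheme as it is commonly documented, the server announces an interval in an `icy-metaint` response header, and after every interval's worth of audio bytes it sends one length byte (counted in 16-byte units) followed by NUL-padded text such as `StreamTitle='...';`. The helper names below are hypothetical, and this is an illustration of the framing rather than the AirSonos code itself.

```javascript
// Builds one ICY metadata block: a single length byte (in units of
// 16 bytes) followed by the metadata text, NUL-padded to that length.
function icyMetadataBlock(title) {
  const text = Buffer.from("StreamTitle='" + title + "';", 'utf8');
  const units = Math.ceil(text.length / 16);
  if (units > 255) {
    throw new Error('metadata too long for a one-byte length prefix');
  }
  const block = Buffer.alloc(1 + units * 16); // zero-filled padding
  block[0] = units;
  text.copy(block, 1);
  return block;
}

// Inserts a metadata block after every `metaInt` bytes of audio.
// Leftover audio shorter than `metaInt` is appended without metadata.
function interleaveMetadata(audio, metaInt, title) {
  const parts = [];
  let offset = 0;
  for (; offset + metaInt <= audio.length; offset += metaInt) {
    parts.push(audio.slice(offset, offset + metaInt));
    parts.push(icyMetadataBlock(title));
  }
  if (offset < audio.length) {
    parts.push(audio.slice(offset));
  }
  return Buffer.concat(parts);
}
```

A real server would do this on a live stream rather than a single buffer, and would only include metadata when the client asks for it, but the byte layout is the same.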

Putting it all together

The sum of these parts is AirSonos, a node package that detects Sonos devices on a network and provides a server for each one that you can connect to via AirPlay. Running AirSonos looks something like:

$ airsonos
Searching for Sonos devices on network...
Setting up AirSonos for Portable {172.17.107.66:1400}
Setting up AirSonos for Playroom {172.17.106.196:1400}

Once it is running, the AirSonos “devices” should be accessible from any iOS/OS X device on the network.

AirSonos devices in OS X

AirSonos works with node version 0.10.28 (you can use n if you have a different version of node), and you can install it with:

npm install -g airsonos

There’s still a ton of room for improvement (e.g. running AirSonos via a terminal script is not the best), so feel free to open issues and send along feedback!

