Project: Custom AV Remote w/ Flutter and Sony Audio Control API

Dave Tate
Published in The Startup
10 min read · Jul 10, 2019


Project Summary

A custom audio/video-systems remote control built using the Flutter UI framework, a compact touch-screen, and a Sony AV receiver.



  • Sony STR-DN1080 7.2ch Home Theater AV Receiver (~$600 USD)
  • Alcatel TCL LX A502DL (~$40 USD)
  • Magnetic USB charging cable

The Goal

Problem: Conference room packed with media sources and multiple screens and audio systems.

Objective: Anyone should be able to route any media source anywhere with no difficulty.

A key part of the objective is anyone. Everyone knows what they want to do and how to accomplish it, but nobody could possibly know how somebody else decided to wire up cables to routing matrices or what B3 currently does on an infrared remote. It’s impossible for a first-time user to intuit the designed signal flow paths, which are often deliberately hidden for aesthetic reasons. Worse, people know that some of the choices available to them are necessary, but others could “mess up” the arbitrary configuration in the room. We can do better than a laminated input/output lookup table. The goal is to create a remote control that a first-time visitor could instantly recognize and operate effortlessly and without trepidation.

A wise person once told me:

“Someone has built this already”

This is good advice. Let’s examine what is available. Logitech Harmony offers arguably the top-of-the-line consumer option for a programmable universal remote.

The Logitech Harmony Elite

Crestron is a popular go-to for larger scale commercial remote control applications.

Catchily-named Crestron TS-1542-B-S

While many of these visual interfaces don’t look bad, there is a balance that must be struck between optimum user experience and providing an easily configurable system that can be a commercially viable “everything-to-everyone” device.

Creating a flexible framework for non-technical users to quickly alter or extend the AV system configuration to suit their needs is not an objective for me on this project; this is a passion project. Set on being uncompromising about UX, I decided to build a totally custom UI for the remote. So: a remote for what?

The Receiver

The heart of this project is 90% of the equipment cost: Sony’s STR-DN1080 receiver. For my particular needs, this was virtually my only option. Despite limited competition, I’m happy with it. Here are my AV requirement particulars which explain this choice:

  1. Route at least 4 input sources to at least 2 video outputs simultaneously
  2. Support audio from multiple sources
  3. Have external control support (RS-232 or network/IP)

These three requirements are deceptively limiting. The first consideration is the signal routing matrix. There are three important terms used by HDMI signal routing device manufacturers:

  • HDMI Switch: Toggles one of multiple source signals to a destination.
  • HDMI Splitter: Duplicates signal from one source to multiple destinations simultaneously.
  • HDMI Matrix: A switch and splitter in one. This is critical in order to route any input to any output, and to use multiple routes at once.

In my case, I have a TV and a projector in the room, and we want to be able to use both at the same time, with different sources. That means a matrix is the only option, which cuts the routing device options to a fraction. I need at least 4x2 (four source inputs, two output destinations), reducing the candidates further. Finally, I need an external control interface of some sort.
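To make the three terms concrete, here is a minimal sketch of a 4x2 matrix as a data structure. It is a toy model, not any vendor's control interface; the class name and the source and display labels are hypothetical.

```python
class HdmiMatrix:
    """Toy model of an N-input, M-output HDMI matrix."""

    def __init__(self, inputs, outputs):
        self.inputs = list(inputs)
        # Each output remembers which input it currently shows (None = nothing routed).
        self.routes = {output: None for output in outputs}

    def route(self, source, destination):
        """Send one input to one output without disturbing the other outputs."""
        if source not in self.inputs:
            raise ValueError(f"unknown input: {source}")
        if destination not in self.routes:
            raise ValueError(f"unknown output: {destination}")
        self.routes[destination] = source


# Hypothetical labels for a 4x2 room: four sources, two displays.
matrix = HdmiMatrix(
    inputs=["laptop", "apple_tv", "blu_ray", "conference_pc"],
    outputs=["tv", "projector"],
)
matrix.route("laptop", "tv")           # switch behaviour: pick a source for the TV
matrix.route("apple_tv", "projector")  # a different source on the projector at the same time
matrix.route("laptop", "projector")    # splitter behaviour: one source on both outputs
```

A plain switch would hold a single route, and a splitter would hold a single source; the per-output table is what makes it a matrix.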

Everything supports infrared and comes with an IR remote, but I don’t want the experience to require aiming beams or re-positioning things, and I’d prefer to hide devices completely without forwarding signals through chains of IR repeaters. That leaves just RS-232, the classic serial communication standard, or network options. Very few HDMI matrices offer network options, and those that do are pricey. Thus, my first approach was to use RS-232 ports for remote control. I can’t connect a touchscreen remote to an RS-232 port, so I’d need a middle-man. I bought a Raspberry Pi Zero W (~$35 USD) and got busy setting it up as a web server that would proxy commands from the touchscreen remote to the devices attached to the Pi through an RS-232 USB adapter. Unfortunately, the relatively attractively priced matrix device I purchased (~$180 USD) turned out to have a physical RS-232 port but poor firmware support for RS-232 control. Even their own software control application that shipped in the box couldn’t actually switch any routing, which their support team eventually confessed. You get what you pay for.
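For illustration, the middle-man idea could look roughly like the sketch below: the remote sends a web request, and the Pi translates it into bytes for the serial port. This is not the code from the abandoned Pi setup. The /route path, the query parameters, and the "ROUTE" command syntax are all invented here, since every matrix defines its own RS-232 command set, and an in-memory buffer stands in for the USB serial adapter.

```python
import io
from urllib.parse import urlparse, parse_qs


def build_serial_command(request_path):
    """Translate a path such as '/route?input=3&output=1' into bytes for the serial port.

    The 'ROUTE <in> <out>' syntax (terminated by CR LF) is invented for this sketch;
    a real matrix documents its own command set.
    """
    parsed = urlparse(request_path)
    if parsed.path != "/route":
        raise ValueError(f"unsupported path: {parsed.path}")
    query = parse_qs(parsed.query)
    source = int(query["input"][0])
    destination = int(query["output"][0])
    return f"ROUTE {source} {destination}\r\n".encode("ascii")


def forward(request_path, port):
    """Write the translated command to any object with a write() method."""
    command = build_serial_command(request_path)
    port.write(command)
    return command


# An in-memory buffer stands in for the USB serial adapter here.
fake_port = io.BytesIO()
forward("/route?input=3&output=1", fake_port)
```

On real hardware the buffer would be replaced by an open serial connection, and the function would be called from whatever web server handles the remote's requests.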

Instead of buying a different brand RS-232-capable HDMI matrix and hoping it actually worked, I decided to pursue network interface options. Ultimately, I couldn’t find one in my budget. Simultaneously, I was working on requirement #2, audio support. For reasons I don’t understand, very few matrices offer both RS-232 and audio extraction. I would need this to send audio from any source to the amplifier powering the ceiling speakers. Otherwise, I would need to add an HDMI audio extractor to both outputs in order to send along to the audio amp, which itself would then need some remote input switching control; or else a third HDMI output + audio extractor, which hurled me into unnecessary i/o territory of 8x8 matrices (nobody makes a 4x3).

Sony STR-DN1080 7.2ch Home Theater AV Receiver

This is a great home theater device, and superficially total overkill for this project. 7.2 surround? I need two-point-oh. The amplification quality is superb; I need a mediocre amp for the modest ceiling speakers in this conference room, and I already have that. The reason this device is ideally suited is the price point. Most of the HDMI Matrix + Extractors + Raspberry Pi options I had considered connecting to my current amplifier cost the same as or more than replacing everything with the Sony receiver.

Simultaneous routing of input/output in HDMI device-world is called “matrix”, but the same in av-receiver-world is called “multi-zone”. I had initially considered and rejected the 2-Zone STR-DN1080 because it didn’t appear to support multiple video outputs at once: Zone 2 only supports audio. Bummer. But I kept returning to this device, confused about what an “HDMI Zone” is on a product that advertises only two zones. So, I went to Best Buy and asked the experts. They confirmed, sadly, no: the device does not support multiple simultaneous video outputs, despite having two HDMI out ports. No purchase. Still, I kept returning to this device, struggling through the oceans of features I didn’t need, trying to comprehend how this wasn’t exactly what I wanted. Amazon, being Amazon, eventually nailed me with a steep discount ad, so I just bought it to experiment. It absolutely does output multiple video sources at once. The HDMI Zone (also labeled Out B) works simultaneously with main Output A, and they can have different inputs. As a bonus, this has a network interface, including wireless support, with a full-fledged control API. Nice. And look at this API:

Sony Audio Control API

I’m sure lots of devices have a functional network API, but Sony appears to be giving their API first-class treatment. The documentation is online, easy to find, easy to read, includes functional examples, etc. I used it. It’s good.

I don’t know why it’s called “Audio Control API”. It controls more than audio. You can toggle device power to active/standby, switch HDMI (video) routing and more. Sorry Raspberry Pi, I promise I’ll find something else for you to do.
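As a rough idea of what talking to the receiver looks like, here is a minimal Python sketch (not the Dart code from the app). It assumes the API accepts JSON bodies POSTed to a per-service URL; the port number, method names, version strings, and URI formats are written from my recollection of Sony's public documentation and should be checked there before use. The helper names are hypothetical.

```python
import json
import urllib.request


def build_request(method, params, version="1.1", request_id=1):
    """Build the JSON-RPC style body; method names and versions are assumptions to verify."""
    return {"method": method, "id": request_id, "params": [params], "version": version}


def endpoint(host, service, port=10000):
    """Each service (for example system, avContent, audio) gets its own URL.

    Port 10000 is an assumption; confirm it for your device.
    """
    return f"http://{host}:{port}/sony/{service}"


def send(host, service, body, timeout=5):
    """POST one request to the receiver and return the decoded JSON reply."""
    request = urllib.request.Request(
        endpoint(host, service),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Power: active/standby toggling, as described above.
power_on = build_request("setPowerStatus", {"status": "active"})

# Routing: the output and input URIs below are assumed examples, not verified values.
route_hdmi = build_request(
    "setPlayContent",
    {"output": "extOutput:zone?zone=1", "uri": "extInput:hdmi?port=1"},
    version="1.2",
)
```

With a receiver on the network, something like send("192.168.1.50", "system", power_on) would issue the request; the address is a placeholder.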

This meets all my requirements. I get more inputs than I required, can output different sources simultaneously to TV and projector (#1), several sources (although limited) can output audio to the “big” speakers (#2), and they wrote the API for me (#3). Now all I have to do is create the touch interface.

The Remote

What’s the cheapest touch-capacitive screen you can rig up? I think it’s a pre-paid phone: Alcatel TCL LX A502DL.

Alcatel TCL LX A502DL

I spent less than $35 USD and got an Android-based phone with an HD screen. There is probably an Amazon Fire in this range, but an Apple iPod Touch starts at $200.

I know there are $15 touch-capacitive screen elements, but you’d still need a miniature computer with a wireless network adapter, and for more convenience, a battery power source. Someone could, but I can’t build a more attractive and functional self-contained control device than using a cheap smart phone. The A502DL gets it done. I disabled the phoney-things I didn’t need. The screen looks terrible at severe angles, but looking straight-on it’s completely…adequate. With this choice, the remote software is decided: Android app.

The App

Let’s build an app. What if I change my mind and want to support iPod Touch? How can I make what I want quickly without reinventing the wheel?

Flutter is Google’s mobile UI framework for crafting high-quality native experiences on iOS and Android in record time

Sounds good. Flutter is written in the Dart programming language. The framework ships with “MaterialApp” scaffolding, which gives me access to basic UI components which adhere to material design principles. I like material design, and it’s naturally appropriate for an Android app experience, so I used this scaffolding to build.

The specific Dart-language code I wrote is not in the scope of this article. I will say that I liked working with Flutter and found it approachable as a first-time user. I RTFM and built an experience streamlined and customized specifically for the exact use-case and equipment in my room. The app source code can be perused here; most of the business is in the /lib directory:

I installed Android Studio to get access to the Android device simulator. I enjoy coding in VS Code, and the VS Code Flutter Extension was critical to development with the simulator. The debugging toolkit is very good.

Color. Black is harsh and overused. Personally, I prefer more dignified cool grays. However, I’m building a remote control designed to stay on all the time. Remote controls use batteries, and screens burn in and out. There is a considerable penalty for light. Black is the most practical background choice for something that is intended to be left on for long periods, even in disuse. My app is mostly black with white text. Regardless of aesthetic preferences, I think this is a reasonable call.

The app can be made to take over the entire screen, including the system menu bar (kiosk-style), so the user sees nothing except the choice they seek. The experience is the shortest and simplest path to putting the thing you want on the screen you want (as few as two taps), with no extraneous options or flashy icons or symbols. Scrollable lists, screen transitions, and progress indicators make the experience feel premium and natural. The Flutter MaterialApp scaffolding made much of this a breeze to implement. I also wrote in timers to reset to an attractive home screen when idle, promoting our company logo.
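The idle reset boils down to remembering the time of the last touch and comparing it against a timeout. The sketch below shows that logic in Python rather than Dart; it is not the app's implementation, and the class name and the 30-second timeout are hypothetical.

```python
import time


class IdleReset:
    """Tracks the last touch and reports when the UI should return to the home screen."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_touch = clock()

    def touch(self):
        """Call on every user interaction to restart the countdown."""
        self.last_touch = self.clock()

    def should_reset(self):
        """True once the screen has been idle for the full timeout."""
        return self.clock() - self.last_touch >= self.timeout_seconds


# A fake clock makes the behaviour easy to follow without waiting.
now = [0.0]
idle = IdleReset(timeout_seconds=30, clock=lambda: now[0])
now[0] = 10.0
still_active = idle.should_reset()  # only 10 s since start, so no reset yet
idle.touch()
now[0] = 45.0
timed_out = idle.should_reset()     # 35 s since the touch at t=10, so reset
```

Injecting the clock keeps the logic testable; in an app, the check would typically run from a periodic timer or be replaced by a timer that is cancelled and restarted on each touch.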

Source selection
Destination selection
System power up screen

Something that caught me off guard was the Material Slider UI component. I’d been excited about the bonus that I could potentially control audio volume from the i/o controller as well, thanks to the Sony API. After optimizing the design of the app for a vertical orientation, I was disappointed to find that the pre-fab slider component only works horizontally, which looks and feels clumsier on a vertically oriented device. I also think it is unintuitive to think about audio volume as being turned “left” or “right” rather than “up” or “down”. UPDATE: There is now a vertical slider available.
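The “up means louder” mapping is simple arithmetic once you remember that screen coordinates grow downward. This is a minimal sketch of that calculation, independent of any Flutter widget; the function name, the pixel coordinates, and the 0 to 100 volume range are assumptions for illustration.

```python
def volume_from_touch(touch_y, track_top, track_bottom, min_volume=0, max_volume=100):
    """Map a vertical touch position to a volume; the top of the track is loudest."""
    if track_bottom <= track_top:
        raise ValueError("track_bottom must be greater than track_top")
    # Clamp so drags past either end of the track stay in range.
    clamped = min(max(touch_y, track_top), track_bottom)
    # Screen y grows downward, so invert: 0.0 at the bottom, 1.0 at the top.
    fraction = (track_bottom - clamped) / (track_bottom - track_top)
    return round(min_volume + fraction * (max_volume - min_volume))


# A track drawn from y=100 (top) to y=500 (bottom), touched at its midpoint.
midpoint_volume = volume_from_touch(300, track_top=100, track_bottom=500)
```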


Finally, I added a magnetic charging adapter and a stand for the remote (smartphone).

Magnetic break-away USB power

I didn’t want the final beautiful product to be a cable-tethered phone lying on the table, or a discharging phone stashed in a drawer. Android system settings permit the display to remain perpetually illuminated while plugged in, so it always looks like what it is, and not like someone forgot their phone.

The stand and magnetic cable combine to give the finishing touch of a professional-looking AV controller, inviting unfamiliar users to interact in familiar ways.