A couple of months back I came across a tweet from @freezydorito. She had found an unsecured API on her TV and was trying to build an Alexa Skill to utilize the API and control the TV by voice.
I’m a huge fan of voice-controlled devices. Our house has multiple Amazon Echos and is fitted out with Philips Hue lighting in every room, so using voice has become second nature. The TVs in my house are Sony Bravias, but they are slightly older and don’t have native Alexa support built in, so I wanted to see if my TVs still had APIs. To my delight, they did!
My end goal was to be able to control the power, inputs, channels, volume and playback on my TVs. Looking over the API reference from Sony, it appeared this was all possible.
If you’d like to play along at home or get something running with minimal effort, I’ve published all my code to GitHub.
At a high level, when you interact with an Alexa Skill, your Echo device sends the command to the Alexa Event Gateway, which invokes a Lambda Function associated with your skill. The problem this poses is that Lambda can’t directly interact with my TV. Instead, I’ll use a RaspberryPi to act as the interpreter between AWS and my TV, which isn’t overly different from how devices like the Philips Hue hub work.
The way I’ll do this is by creating a Thing in IoT Core, subscribing it to a topic and having Lambda publish commands to that same topic, enabling the RaspberryPi to perform the actions requested through Alexa.
There are four main parts to making this work:
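To make the publish/subscribe flow concrete, here is a minimal sketch that runs entirely in memory. It assumes nothing about the real project beyond the idea of a shared topic: `FakeBroker` is a hypothetical stand-in for IoT Core built on Node's `EventEmitter`, and the `{ action, value }` message shape is made up for illustration.

```javascript
const { EventEmitter } = require('events');

// In-memory stand-in for the IoT Core message broker (assumption: the real
// setup uses IoT Core; this only imitates the publish/subscribe shape).
class FakeBroker {
  constructor() {
    this.emitter = new EventEmitter();
  }
  subscribe(topic, handler) {
    // Deliver each message to the handler as a parsed object.
    this.emitter.on(topic, (raw) => handler(JSON.parse(raw)));
  }
  publish(topic, message) {
    // Messages travel as JSON strings, much as they would over a real broker.
    this.emitter.emit(topic, JSON.stringify(message));
  }
}

// Hypothetical topic naming: one topic per TV, e.g. "tv_topic/tv1".
function topicFor(deviceId) {
  return `tv_topic/${deviceId}`;
}

const broker = new FakeBroker();
const received = [];

// RaspberryPi side: listen on the TV's topic and record each command.
broker.subscribe(topicFor('tv1'), (command) => {
  received.push(command);
});

// Lambda side: publish a command to the same topic.
broker.publish(topicFor('tv1'), { action: 'mute', value: true });

console.log(received);
```

The point of the shared topic is that Lambda never needs to know where the TV lives on the network; it only needs permission to publish.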
- Smart Home Skill
- IoT Core
- Lambda Function
- RaspberryPi Code
I’ll start with the Smart Home Skill. Before I do, if you’ve never built an Alexa skill before, take a read through the Getting Started Steps. You should also understand what Lambda is and how to create a Lambda function.
1. Smart Home Skill
Navigate to the Alexa Developer Console and create a new skill. Give it a name, select the default language and choose Smart Home as the skill type.
That’s it, all done, nothing more to do… well, maybe not quite!
2. IoT Core
As mentioned earlier, you need to generate an IoT thing, along with a policy, and certificates, to be able to consume messages on the RaspberryPi. I used the Onboard functionality in IoT Core to register a device and generate a connection kit which creates the policy and certificates for me.
Go through the process of onboarding your device and then take a copy of the certificates and the IoT endpoint address.
3. Lambda Function (and Cognito)
A Lambda function needs to be created to perform device discovery and handle all of the events Alexa throws its way. I’ll be using the serverless framework to create my Lambda function.
You’ll need to update the serverless.yml file with your IoT endpoint and Alexa Skill ID.
The code in the Lambda function is based on the information from Creating a Smart Home Skill, which will help you understand the model and the way Alexa works. There are five different namespaces utilized in my code:
- Alexa.Discovery: The namespace which helps Alexa discover the devices it can control (in this case, the TV).
- Alexa.Speaker: Controls the volume or mute of the TV.
- Alexa.PlaybackController: Play, Pause and Stop.
- Alexa.InputController: Switching between various inputs attached to the TV.
- Alexa.ChannelController: Changing between Free to Air TV stations.
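As a rough illustration of how a handler might route these namespaces, the sketch below turns an incoming directive into a small command object. The function name `toCommand` and the `{ action, value }` command shape are hypothetical and not taken from my repository; the directive and payload field names follow my reading of the Smart Home Skill API, so check them against the current reference. Discovery is left out because it returns a response to Alexa rather than a command for the TV.

```javascript
// Map an Alexa Smart Home directive to a simple command for the RaspberryPi.
// The command shape ({ action, value }) is a hypothetical format for this sketch.
function toCommand(event) {
  const { header, payload = {} } = event.directive;
  switch (header.namespace) {
    case 'Alexa.Speaker':
      if (header.name === 'SetMute') return { action: 'mute', value: payload.mute };
      if (header.name === 'SetVolume') return { action: 'volume', value: payload.volume };
      if (header.name === 'AdjustVolume') return { action: 'adjustVolume', value: payload.volume };
      break;
    case 'Alexa.PlaybackController':
      // The directive name (e.g. Play, Pause, Stop) becomes the action.
      return { action: header.name.toLowerCase() };
    case 'Alexa.InputController':
      if (header.name === 'SelectInput') return { action: 'input', value: payload.input };
      break;
    case 'Alexa.ChannelController':
      if (header.name === 'ChangeChannel') {
        // The channel number may be absent if the user asked for a channel by name.
        return { action: 'channel', value: payload.channel && payload.channel.number };
      }
      break;
  }
  throw new Error(`Unsupported directive: ${header.namespace}.${header.name}`);
}

const muteDirective = {
  directive: {
    header: { namespace: 'Alexa.Speaker', name: 'SetMute' },
    payload: { mute: true },
  },
};
console.log(toCommand(muteDirective));
```

Throwing on anything unrecognised keeps unsupported requests from being silently dropped.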
Because this code is hacked together and is by no means production-ready, the function DiscoverDevices has one of my TVs hard-coded in it. Ideally, this information would be stored in a database such as DynamoDB as part of on-boarding your device and retrieved as part of discovery.
Cognito. Although we’re not actually going to be using Cognito to perform any authentication, Smart Home Skills require you to authenticate when assigning the skill to your account (even in development). This would usually be to see what devices are assigned to your account with that manufacturer.
We can’t bypass this step, so read more on Account Linking and go through the process of Setting up Account Linking. You’ll need to update the Callback and Logout URLs in the Cognito.yml file to be those of your Cognito User Pool.
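For a feel of what a hard-coded discovery reply can look like, here is a minimal sketch. The device entry, the `buildDiscoveryResponse` and `capability` names, and the description text are all hypothetical; the response fields follow my understanding of the Alexa discovery format and should be verified against the Smart Home Skill documentation.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical hard-coded device list; a fuller build could load this from a database.
const DEVICES = [
  { endpointId: 'tv1', friendlyName: 'TV 1', inputs: ['HDMI 1', 'HDMI 2'] },
];

// Helper that fills in the fields every capability entry shares.
function capability(interfaceName, extra = {}) {
  return { type: 'AlexaInterface', interface: interfaceName, version: '3', ...extra };
}

function buildDiscoveryResponse(devices) {
  return {
    event: {
      header: {
        namespace: 'Alexa.Discovery',
        name: 'Discover.Response',
        payloadVersion: '3',
        messageId: randomUUID(),
      },
      payload: {
        endpoints: devices.map((d) => ({
          endpointId: d.endpointId,
          manufacturerName: 'Sony',
          friendlyName: d.friendlyName,
          description: 'TV controlled via a RaspberryPi', // illustrative text
          displayCategories: ['TV'],
          capabilities: [
            capability('Alexa'),
            capability('Alexa.Speaker'),
            capability('Alexa.PlaybackController', { supportedOperations: ['Play', 'Pause', 'Stop'] }),
            capability('Alexa.InputController', { inputs: d.inputs.map((name) => ({ name })) }),
            capability('Alexa.ChannelController'),
          ],
        })),
      },
    },
  };
}

console.log(JSON.stringify(buildDiscoveryResponse(DEVICES), null, 2));
```

Swapping `DEVICES` for a database query would be the first step towards removing the hard-coding.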
4. RaspberryPi Code
Although I’m using a RaspberryPi, this code can be run anywhere that runs Node.js. In fact, I developed and tested it initially on my Mac, and it now runs in Docker on my RaspberryPi.
To get Docker up and running on your RaspberryPi:
- Install docker
curl -sSL https://get.docker.com | sh
- Add permissions to the pi user to run docker commands
sudo usermod -aG docker pi
- Reboot the RaspberryPi
The RaspberryPi does all the heavy lifting: it communicates with IoT Core and instructs the TV what to do.
To achieve this, I used several different libraries:
- aws-iot-device-sdk: Enables the RaspberryPi to connect to IoT Core
- sony-bravia-tv-remote: Pre-generated commands to perform actions on the TV
- axios: Used to make RESTful calls to the TV’s API for more complex commands, such as selecting a specific input.
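To give an idea of what one of those RESTful calls might look like, the sketch below builds (but does not send) a request for selecting an HDMI input. Treat the details as assumptions: the `/sony/avContent` path, the `X-Auth-PSK` header, the `setPlayContent` method and the `extInput:hdmi?port=1` URI reflect my understanding of Sony's API reference and should be verified there, `buildTvRequest` is a hypothetical helper, and the IP address and code are placeholders.

```javascript
// Build (but do not send) a request object for the TV's REST API.
// Assumption: the TV accepts JSON bodies of the form { method, id, params, version }
// and authenticates with the code passed in the X-Auth-PSK header.
function buildTvRequest(ip, code, service, method, params) {
  return {
    method: 'post',
    url: `http://${ip}/sony/${service}`,
    headers: { 'Content-Type': 'application/json', 'X-Auth-PSK': code },
    data: { method, id: 1, params, version: '1.0' },
  };
}

// Placeholder IP and code; substitute the values for your own TV.
const selectHdmi1 = buildTvRequest('192.168.0.10', '1234', 'avContent', 'setPlayContent', [
  { uri: 'extInput:hdmi?port=1' },
]);

console.log(JSON.stringify(selectHdmi1, null, 2));
```

The object uses the same keys as an axios request config (`method`, `url`, `headers`, `data`), so with axios installed it could be passed along as `axios(selectHdmi1)`.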
There are a few more steps before we can make everything work:
- Clone the GitHub repository, specifically the raspberrypi directory
- Copy the certificates generated in IoT Core Onboarding to the directory connect_device_package
- In the raspberrypi directory (the same directory as Dockerfile), create a new .env file, e.g. tv1.env, and populate it with the following:
# These are the standard device variables
CODE=1234
# Details to connect to the IoT Service
CLIENTID=sdk-nodejs-xxxxxxxxxxxxxx
# These are the variables for inputs
The first section contains the variables to describe the television you want to control. The device and name can be anything you want them to be. The IP and Code need to be specific to your TV. On Sony TVs, you need to enable and set the code.
The second section contains details of the IoT service. This should be set to the details generated when onboarding your device.
The third section contains input variables of devices connected to the TV and should match what was declared in the DiscoverDevices function in Lambda. In my case, I have a home theatre amp which is connected to my TV and peripherals. These devices are all controlled by CEC and are addressed differently to your standard HDMI inputs. If using a Sony TV, you can get this information here.
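Since a missing variable tends to fail in confusing ways later on, a small startup check can help. This is a minimal sketch rather than code from my repository: `loadConfig` is a hypothetical helper, and only `CODE` and `CLIENTID` are variable names visible in the sample file, so extend the list with whatever else your .env file defines.

```javascript
// Read required settings from an environment object and fail fast if any are missing.
function loadConfig(env, requiredNames) {
  const missing = requiredNames.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // Return only the settings that were asked for.
  return Object.fromEntries(requiredNames.map((name) => [name, env[name]]));
}

// Example values only; in the container you would pass process.env instead.
const config = loadConfig(
  { CODE: '1234', CLIENTID: 'sdk-nodejs-example' },
  ['CODE', 'CLIENTID']
);
console.log(config);
```

Passing the environment in as an argument, instead of reading `process.env` inside the function, makes the check easy to try out without touching real settings.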
Once this is complete, run the following command to build the Docker image:
docker build -t dockerpi .
Back in the Alexa Developer Console, we need to populate and save the Lambda Function ARN as the endpoint for your Alexa Skill. This tells Alexa where to direct all requests to interact with your Skill.
We now need to enable the skill on your account using the Alexa Mobile Application. Navigate to Skills & Games > Your Skills > Dev to enable the skill for your account. Remember where I said we needed Cognito to authenticate? This is where you’ll be prompted to enter a username and password that exist within your Cognito User Pool (you’ll need to create this user).
Last but not least, let’s start the container. In the directory where your .env file exists, run the command to start the container:
docker run -ti --env-file tv1.env dockerpi
The terminal should produce output similar to the below if connected successfully:
Gavin@Gavins-MBP-2 % docker run -ti --env-file tv1.env dockerpi
connect
TV 1 is connected
Listening on the topic: tv_topic/tv1
TV IP address has been set to: 192.168.0.xxx
Trying it Out
You can use utterances such as:
- Alexa, turn ON/OFF Device
- Alexa, mute Device
- Alexa, change Device input to Input
- Alexa, change the channel to Channel on Device
- Alexa, Play/Pause/Stop Device
- Alexa, Increase/Decrease the volume on Device
Let’s try something simple like muting the device. This can be done either via the Alexa Developer Test Console:
Or by speaking to your Echo device:
Let’s try something more complex like changing the channel — I wanted to check the Cricket Scores on Channel 7:
While the code in the example is nowhere near production-ready, it was a fun project to undertake over my Christmas break. Although it could be argued that using a remote is far simpler when in front of the TV, using voice can be useful when you realise you’ve left the TV on while walking out the door, or when you want to mute an ad while preparing dinner.