Roku WebDriver Test Automation: Lessons From the Field

Published in

Globant

13 min read · Jan 4, 2021

Overview

A study from earlier this year found that 80% of homes with TVs in the United States have at least one connected TV (CTV) device. Roku-branded Smart TVs and set-top devices make up 39% of the CTV market in the U.S. (Forbes and Adgate 2020). The first Roku set-top box, introduced in 2008, brought Netflix’s streaming video service, still in its infancy, to consumers’ TV sets. Shortly thereafter, the Roku Channel Store was launched and an SDK was made available to allow developers to easily release content to the platform. Today there are over 1800 Roku channels available.

Though Roku has established itself as the leading CTV platform in the U.S. over the last 12 years, the ability to effectively automate tests on Roku is a recent development. In December 2019, Roku, Inc. released Roku WebDriver, an official solution for Roku test automation, derived from the popular open source Selenium WebDriver, used on other platforms. Prior to that, options for automation on the platform were extremely limited.

When Roku introduced their automation solution, I had just begun working on an experimental Roku test automation project, so my team was able to leverage Roku WebDriver shortly after it was released. Over time, the solution we have built using Roku WebDriver has proved invaluable to my team as it has allowed for fast and thorough feedback, expediting the identification of defects.

Though derived from Selenium, Roku WebDriver is a subset of the solution optimized for CTV, and is still in its infancy. Additionally, the way users generally interact with CTV user interfaces and the specifics of how Roku channels are developed present additional challenges that impact test automation teams. In this article I will provide an overview of sorts and share tips learned from direct experience working with Roku WebDriver over the past year since it was released.

The case for automation on Roku

There are a lot of great reasons to automate tests as part of a Roku channel development effort, but here are some highlights:

Tedious Button Sequences: since the standard means for interacting with a Roku channel is via a simple remote with a basic d-pad controller and a few other buttons, some important operations can be fairly tedious to execute (example: typing out an email address and password to authenticate using an on screen keyboard). Though CTV experiences don’t require users to execute these functions frequently, they are critical scenarios to include in regression tests and are not difficult to automate.
Store Certification Requirements: there is a fairly stringent set of Certification criteria that need to be met to release a channel to the Roku Channel Store. These requirements include deep linking standards, which can be time consuming to test manually because there are a variety of different deep link scenarios that all channels must support. These scenarios can be automated easily using Roku WebDriver.
Development Tooling Limitations: development of Roku channels is done using a language called BrightScript, which is almost exclusively used for Roku channel development. BrightScript lacks the tooling that is available for other modern programming languages, so practices such as writing unit tests and linting are not standard across teams and the way channels are architected is likely to vary widely across organizations. Also, the BrightScript language itself is not as advanced as other languages. Until the latest 9.4 OS release that just came out, there was no BrightScript equivalent of try/catch. Due to these factors, Roku channels can be brittle: a simple coding error can cause a channel to crash. Without standardized code analysis tools to find these types of mistakes before a build is generated, there is an elevated need for testing, and automated tests are especially valuable.
Device limitations and compatibility: Roku devices use inexpensive hardware, so the processing, memory and storage resources are much more limited than that which would be present on a typical laptop or smartphone. This is especially the case with older Roku models, which all channels must support. Automation can help us find when an application is being pushed to its limits. Memory and performance defects of this nature can be difficult to isolate and reproduce in a consistent manner via manual testing. The opportunity to automate scenarios that involve a lot of repetition helps to identify problematic scenarios and offers a way to measure progress while working to resolve performance issues.
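To make the deep linking point above more concrete, here is a minimal Python sketch that builds one launch URL per deep link scenario. It assumes the ECP launch URL format (port 8060, contentId and mediaType query parameters) and a sideloaded channel with the id "dev"; the content ids in SCENARIOS are hypothetical placeholders, and the media types your channel must support should be taken from the current certification criteria.

```python
from urllib.parse import urlencode

ECP_PORT = 8060  # standard ECP port on Roku devices


def deep_link_url(device_ip, channel_id, content_id, media_type):
    """Build an ECP launch URL that deep links into a channel."""
    query = urlencode({"contentId": content_id, "mediaType": media_type})
    return f"http://{device_ip}:{ECP_PORT}/launch/{channel_id}?{query}"


# Hypothetical scenario table: (content id, media type).
# Replace with ids from your own catalog.
SCENARIOS = [
    ("movie-001", "movie"),
    ("series-001", "series"),
    ("episode-001", "episode"),
]


def deep_link_urls(device_ip, channel_id="dev"):
    """Return one launch URL per scenario, in table order."""
    return [
        deep_link_url(device_ip, channel_id, content_id, media_type)
        for content_id, media_type in SCENARIOS
    ]
```

Each URL would be sent as an HTTP POST with an empty body, followed by assertions on the screen the channel lands on. Keeping the scenarios in a table makes it easy to add a case when the certification criteria change.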

Note: it is not possible to use virtualization to simulate a Roku. For content security reasons, the platform is highly protected, so you must use physical Roku devices for all development and testing.

With all of these challenges, my recommendation is that all but the most simple Roku channel development efforts should plan to invest resources in test automation. This will reduce the effort required for manual regression testing and allow for more efficient and comprehensive testing overall.

Getting your hands dirty

To get started with Roku WebDriver, refer to the API documentation on the Roku Developer site here:

Roku WebDriver Developer Documentation

On the above page, you will also find links to the Roku WebDriver API documentation, Roku WebDriver Robot Framework Library and Roku WebDriver JavaScript library. If you would like to use another language like Java, C# or Ruby, that should be feasible as well, but since Roku has not provided libraries for those languages you will need to write your own library to call WebDriver’s REST API or look for third party libraries externally.

At the core of the solution Roku provides is an HTTP server that can be used to query the status of Roku devices under test and to send remote commands via ECP (External Control Protocol). Roku provides the following:

A GitHub repo with the source to compile the server, which is written in Go
WebDriver libraries in Robot Framework and JavaScript
A Postman collection to query the WebDriver API
Sample test scripts and a demonstration channel you can sideload on a test device to run the tests against.
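If you do end up writing your own client library for the WebDriver server, it can start out very small. The Python sketch below is only an illustration under stated assumptions: the base URL (http://localhost:9000/v1), the endpoint paths, and the request bodies ({"ip": ...} to open a session, {"button": ...} to press a key) reflect my reading of the API and should be verified against the Roku documentation. The class name and the injectable send function are hypothetical.

```python
import json
from urllib import request


class RokuWebDriverClient:
    """Thin wrapper over the WebDriver server's REST API (paths are assumptions)."""

    def __init__(self, base_url="http://localhost:9000/v1", send=None):
        self.base_url = base_url.rstrip("/")
        # 'send' can be swapped for a fake in unit tests.
        self._send = send or self._http_send
        self.session_id = None

    @staticmethod
    def _http_send(method, url, body):
        """Send a JSON request and return (status code, decoded JSON body)."""
        data = json.dumps(body).encode("utf-8") if body is not None else None
        req = request.Request(
            url, data=data, method=method,
            headers={"Content-Type": "application/json"},
        )
        # Note: urlopen raises HTTPError for 4xx/5xx responses.
        with request.urlopen(req) as resp:
            return resp.status, json.loads(resp.read() or b"{}")

    def start_session(self, device_ip):
        """Open a session against the Roku device at device_ip."""
        status, payload = self._send(
            "POST", f"{self.base_url}/session", {"ip": device_ip})
        self.session_id = payload["sessionId"]
        return status

    def press(self, button):
        """Send a single remote button press, e.g. 'select' or 'down'."""
        url = f"{self.base_url}/session/{self.session_id}/press"
        status, _ = self._send("POST", url, {"button": button})
        return status

    def end_session(self):
        """Close the current session."""
        url = f"{self.base_url}/session/{self.session_id}"
        status, _ = self._send("DELETE", url, None)
        self.session_id = None
        return status
```

Injecting the transport keeps the client testable without a device on the network, which is handy given that Roku devices cannot be virtualized.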

Note: you will need a Roku device running Roku OS 9.1 or higher to execute tests. To sideload an application for testing, you can use a standard Roku device purchased through retail or online channels (no special ‘dev kit’ is required as with some other CTV platforms). However, you do need to run a special procedure on the device to set it up as a development device before you can sideload builds to your device for testing. In case you are not already familiar with that procedure, you can learn how to do that in this blog article.

If you will be developing tests for the platform, I suggest following the step by step process Roku prescribes on the Roku WebDriver Developer Documentation to compile the server binary, sideload the test Roku channel and run the sample tests they provide in Robot Framework and/or JavaScript. From there, the Postman collection they provide is a great way to dig in deeper to get a better understanding of what’s under the hood.

The sample tests Roku provides are a great start, but they only scratch the surface of what is possible when automating tests on the platform. They provide the most essential building blocks, but you should not expect the number of convenience methods that you get out of the box with a solution like Selenium for Web or Appium for mobile testing. Remember that Roku WebDriver is not yet a mature, widely adopted, time-tested solution. It has only been publicly available for about one year, so we will likely see enhancements in the future, either through updates provided by Roku directly or via open source add-ons. For now, expect to do some of your own innovating to extend the solution.

In the example test scripts included with the Roku WebDriver solution, most of the tests are simple remote navigation events, followed by basic queries to check for the presence of text, poster images or other elements in the source. This is a great start, but you will be able to write more effective tests if you assert against the relationships between the components on the page, as opposed to firing off a series of queries to see if various elements are present.

Roku’s sample test methods do not typically parse the response returned by the WebDriver HTTP server. In most cases, they just check whether the query to the API returns a successful response (200 “OK”) as opposed to a failure code (typically 500 “Internal Server Error”). However, when you send an “element” query to the API, it returns a JSON-formatted response representing the underlying component that was queried. So you can always write a query that will return a view, widget or other element within the channel under test and then write your own code to parse the contents of the JSON that is returned. A common example would be testing a grid within a Roku channel, which contains a series of image thumbnails representing multimedia content available for the user to view. In this scenario, you can write a query that returns the “grid” element JSON, then parse the response to extract the properties you care about.

For example, you may want to test the following:

Items in the grid appear in the correct sequence.
Correct image file source url is referenced for each item.
Images in the grid were successfully downloaded from their source.
Correct title loads for each item in the grid.

These types of checks will require parsing the element JSON or XML directly.
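As a rough illustration of the checks listed above, the following Python sketch walks an element response and pulls out the title and poster details of each grid item in document order. The node shape it expects (XMLName/Local for the tag, Attrs as a list of Name/Value pairs, Nodes for children) is an assumption about the JSON the server returns, so compare it with a real response from your channel first. The function names are hypothetical, and the loadStatus attribute may not be exposed for every Poster.

```python
def attrs(node):
    """Flatten a node's attribute list into a plain dict."""
    return {a["Name"]["Local"]: a["Value"] for a in node.get("Attrs") or []}


def walk(node):
    """Yield a node and all of its descendants, depth first, in document order."""
    yield node
    for child in node.get("Nodes") or []:
        yield from walk(child)


def grid_items(grid_node, item_tag="StandardGridItemComponent"):
    """Return title, poster uri and load status for each grid item, in order."""
    items = []
    for node in walk(grid_node):
        if node["XMLName"]["Local"] != item_tag:
            continue
        title = poster_uri = load_status = None
        for descendant in walk(node):
            tag = descendant["XMLName"]["Local"]
            values = attrs(descendant)
            if tag == "Label" and title is None:
                title = values.get("text")
            elif tag == "Poster" and poster_uri is None:
                poster_uri = values.get("uri")
                load_status = values.get("loadStatus")
        items.append(
            {"title": title, "poster_uri": poster_uri, "load_status": load_status}
        )
    return items
```

With the items reduced to a list of small dictionaries, sequence, title and image source checks become ordinary list comparisons against expected catalog data.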

**Fig 1. Screenshot of a grid in a Roku demonstration channel.**

Working with Roku’s XML source

Roku channels use an XML-based source to define application components, which are controlled via BrightScript code. Though this model is similar to the way that HTML and JavaScript are used in Web applications, the Roku solution is much less advanced. Until just about five years ago when Roku introduced “SceneGraph” to allow for rich UI development, Roku channels were designed using BrightScript to manipulate predefined UI templates. There is no direct equivalent to CSS on Roku. From a test automation view, working with Roku XML is rather different from working with an HTML/CSS/JavaScript DOM, so I will touch on some core concepts here.

Firstly, the renderable UI elements in a Roku channel are made up of three primitive element types:

Labels
Posters
Rectangles

Labels display text, Posters render images, and rectangles are self explanatory.

You will also encounter elements of a variety of types including lists, grids, layout groups, renderable nodes and any number of custom elements, but all of those elements use the three primitive elements above to draw the UI elements on screen.

The Roku WebDriver API has a source method which returns the XML source code in base64 encoding. When I develop tests for a new component in a Roku channel, I manually navigate to the view I will be testing against and use a Postman library to query the source API method to review the XML for my channel in the state that I will be writing assertions against (since the response is returned in base64, I use a script in Postman to decode the base64 to XML and then copy the XML to a text editor and format it for easier reading).
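The same decode-and-format step can be scripted outside of Postman. Here is a minimal Python sketch, assuming the response body is JSON with the base64-encoded XML under a "value" key (that key name is an assumption; check it against a real response). The function name decode_source is hypothetical.

```python
import base64
import json
from xml.dom import minidom


def decode_source(response_body):
    """Decode a 'source' response and return indented, readable XML."""
    payload = json.loads(response_body)
    xml_bytes = base64.b64decode(payload["value"])
    return minidom.parseString(xml_bytes).toprettyxml(indent="  ")
```

Saving the formatted XML to a file next to the test that uses it can also serve as a reference when a component's structure changes later.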

**Fig 2. Sample excerpt of XML representing a grid component in the Roku demonstration channel.**

I have found reviewing the XML to be the easiest way to visualize the structure of the underlying elements that my tests will be asserting against. In practice, my automated tests send element queries to the Roku WebDriver API instead of source queries. The element JSON directly correlates to the XML, but it is not very human readable, which is why I use the XML to understand the underlying structures. On my project, we use a custom library to parse the element response body. Each time we test a new channel component, we write code to parse the JSON response for that component and return, in a simplified form, the properties of that element which we want to assert against.

**Fig 3. a POST request sent to a local instance of Roku WebDriver to retrieve information about the focused “StandardGridItemComponent” element shown in the XML example above.**

**Fig. 4. A portion of the JSON wire protocol response received from the above elements query.**
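For readers without access to the figures, an element query of this kind can be sketched in Python roughly as follows. The URL path, the "elementData" key, and the locator fields ("using", "value", "attribute") are assumptions based on my understanding of the API, not a transcription of the request in Fig 3, so confirm them against the official API documentation. The helper name element_request is hypothetical, and the function only builds the request rather than sending it.

```python
import json


def element_request(session_id, locators, base_url="http://localhost:9000/v1"):
    """Build the URL and JSON body for an element query.

    locators: list of dicts such as {"using": "tag", "value": "..."} or
    {"using": "attr", "attribute": "focused", "value": "true"}.
    """
    url = f"{base_url}/session/{session_id}/element"
    body = json.dumps({"elementData": list(locators)})
    return url, body


# Example: the focused grid item from the XML excerpt.
url, body = element_request(
    "abc123",  # hypothetical session id
    [
        {"using": "tag", "value": "StandardGridItemComponent"},
        {"using": "attr", "attribute": "focused", "value": "true"},
    ],
)
```

The body would then be sent as an HTTP POST with a JSON content type, and the "value" portion of the response passed to whatever parsing code you maintain for that component.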

Since all Roku UI elements use the three primitive element types above to render elements on screen, our main task when parsing the JSON responses will be to locate those primitive elements and their properties. However, we will also want to get a sense of the associations they have with parent elements and structures.

Properties you will likely want to check in your tests will include text for Labels, urls for Posters and properties which describe the focus state and visibility of primitive elements. It should be noted that some style attributes will not be available for analysis in the properties returned by Roku WebDriver. Font details and text size, for example, typically will not be returned as part of the XML source for you to check. Also, asserting whether or not a given element actually appears on screen can be done using the ‘bounds’ property that is returned, but these values are relational within the XML tree and therefore complicated to assert against. Hopefully, this type of operation will be simplified in the future as the Roku WebDriver solution matures.
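One way to approach the bounds problem is to accumulate offsets down the tree before comparing against the screen rectangle. The Python sketch below rests on several assumptions that you should verify for your channel: that bounds are formatted like "{x, y, width, height}", that each element's x and y are relative to its parent, and that the UI is laid out on a 1920x1080 canvas. It also ignores scaling, rotation and clipping, so treat it as a starting point only. All function names are hypothetical.

```python
import re

SCREEN = (0.0, 0.0, 1920.0, 1080.0)  # assumed FHD design resolution


def parse_bounds(text):
    """Parse a bounds string such as '{0, 0, 340, 48}' into (x, y, w, h)."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    if len(numbers) != 4:
        raise ValueError(f"unexpected bounds value: {text!r}")
    x, y, w, h = (float(n) for n in numbers)
    return x, y, w, h


def absolute_bounds(chain):
    """Sum parent-relative offsets along a chain of bounds strings.

    chain is ordered from the outermost ancestor down to the element itself;
    the returned width and height are those of the last (innermost) entry.
    """
    abs_x = abs_y = width = height = 0.0
    for text in chain:
        x, y, width, height = parse_bounds(text)
        abs_x += x
        abs_y += y
    return abs_x, abs_y, width, height


def intersects_screen(rect, screen=SCREEN):
    """Return True if any part of rect overlaps the screen rectangle."""
    x, y, w, h = rect
    sx, sy, sw, sh = screen
    return x < sx + sw and x + w > sx and y < sy + sh and y + h > sy
```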

Remote interaction

As with most other CTV platforms, the way that a typical user interacts with a Roku channel is through the basic remote control which comes with the Roku device or television. The primary buttons a user will use to control the device within a Roku channel are the d-pad buttons (Up, Down, Left, Right), the Back button and the transport controls (Fast Forward, Rewind, Play/Pause). Roku also provides a mobile application which mimics the physical remote control buttons and includes a virtual keyboard for character entry. These remote functions can be simulated via the Roku WebDriver API. However, the WebDriver API does not currently support press and hold functionality, which users can use to navigate across a number of items fluidly. If you wish to include tests which use press and hold functionality, you can call the device directly using ECP, which provides a function to achieve this.
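As a sketch of calling the device directly, the Python helpers below build ECP URLs for character entry and for press and hold. They assume ECP's keydown, keyup and keypress commands on port 8060 and the "Lit_" prefix for literal characters; the helper names and the injectable post and sleep callables are hypothetical conveniences for testing.

```python
import time
from urllib import request
from urllib.parse import quote

ECP_PORT = 8060  # standard ECP port on Roku devices


def ecp_url(device_ip, action, key):
    """Build an ECP URL; action is 'keypress', 'keydown' or 'keyup'."""
    return f"http://{device_ip}:{ECP_PORT}/{action}/{key}"


def literal_keys(text):
    """Map each character to an ECP literal key name (URL-encoded)."""
    return ["Lit_" + quote(ch, safe="") for ch in text]


def http_post(url):
    """Send an empty-bodied POST, which is what ECP key commands expect."""
    with request.urlopen(request.Request(url, data=b"", method="POST")):
        pass


def type_text(device_ip, text, post=http_post):
    """Enter text one character at a time via literal key presses."""
    for key in literal_keys(text):
        post(ecp_url(device_ip, "keypress", key))


def press_and_hold(device_ip, key, seconds, post=http_post, sleep=time.sleep):
    """Hold a key down for roughly the given number of seconds."""
    post(ecp_url(device_ip, "keydown", key))
    try:
        sleep(seconds)
    finally:
        # Always release the key, even if the wait is interrupted.
        post(ecp_url(device_ip, "keyup", key))
```

Releasing the key in a finally block is a deliberate choice: a test that dies between keydown and keyup would otherwise leave the device scrolling.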

Since Roku channels (and most other CTV platforms for that matter) do not have a mouse or touchscreen to interact directly with elements on a given page, a lot of click by click navigation within a channel and its various pages is required to access many functions. This will have a big impact on the way you structure your framework and tests. You will likely want to implement helper methods in your solution that navigate your channel in various ways through a series of keyboard interactions. The more configurable your solution is, the more prepared you will likely be to adapt to changes that may happen as the development of the channel progresses over time. Though there are techniques you can use to detect some failure states and account for them, a change as simple as adding a new button to the page or changing its location can break a test sequence. You will need to design your navigation sequences to be robust enough to tolerate a certain amount of runtime inconsistencies (example: wait long enough for a page to load before attempting to navigate within it), but you will not be able to account for all possible scenarios.
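Helper methods of this kind might look like the following Python sketch. It keeps navigation paths as data, so a layout change means editing a table instead of a test, and it polls for a condition rather than sleeping for a fixed time. The names wait_until, expand and navigate are hypothetical, and press and is_ready stand in for whatever client calls your framework uses.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def expand(path):
    """Turn [('down', 2), ('select', 1)] into ['down', 'down', 'select']."""
    return [button for button, count in path for _ in range(count)]


def navigate(press, path, is_ready=None, timeout=10.0, sleep=time.sleep):
    """Press each button in the path, then optionally wait for the target view.

    Returns True if no readiness check was given or if it passed in time.
    """
    for button in expand(path):
        press(button)
    if is_ready is None:
        return True
    return wait_until(is_ready, timeout=timeout, sleep=sleep)


# Hypothetical path from the home view to a settings screen.
HOME_TO_SETTINGS = [("down", 3), ("right", 1), ("select", 1)]
```

A call such as navigate(client.press, HOME_TO_SETTINGS, is_ready=settings_visible) then reads close to the manual steps it replaces.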

This presents a notable dilemma when it comes to test script design on the platform. It is considered a best practice in automated testing to design your tests to execute independently. However, there are a few key considerations that apply here:

Many features within your Roku channel will require a significant series of remote events to navigate to.
The ability of the channel to reliably respond to remote navigation events is also critically important to the end user.
Application launch and navigation events can take a significant amount of time to execute.
A Roku channel that is not designed in a performant way may present issues to users which only occur after extensive navigation within a user session.

My team has obtained a lot of value from “workflow” style test scripts which navigate through a series of events and make numerous assertions along the way to check the state. This means that many of the checks depend heavily on the results of the prior events occurring as expected, which is not a conventional approach. One failure early in a workflow can cause a domino effect, causing all of your subsequent tests in that sequence to fail. This creates some risk of fragility in the test suite. However, you can achieve efficiency in this manner by executing a high number of tests in a shorter amount of time, which I think is important on this platform.

Workflows that cover a lot of ground by executing many page navigation events in a session can expose performance issues in a channel, such as a memory leak that causes a channel to become sluggish or unresponsive. These types of issues are notoriously hard to identify and reproduce via manual testing, so there is a lot of value in having test scripts that can reproduce these types of issues in a consistent manner. This approach may also increase the efforts required to monitor test runs and analyze automated tests results, but on my project the risk has been worth the reward. Ultimately, you will need to make thoughtful decisions based on the needs of your application and consider tradeoffs between test independence, test stability and the ability to get fast feedback.

Conclusion

Roku’s release of a WebDriver API for their platform has opened the door for effective UI test automation. Though the Roku WebDriver resources are relatively new and therefore not as mature as the automation tooling for other platforms, they give us the basics we need to get started with automated testing using JavaScript or Robot Framework. Since the automation is standardized around an HTTP server, there are great opportunities for test automation development teams to leverage and further extend these resources to meet the specific needs of their team and their Roku channel(s).

The strong demand for high quality experiences and the specific needs for testing on the platform (including Roku Channel Store Certification criteria) make investing in test automation a worthwhile endeavor.

Citations

Forbes and Brad Adgate. 2020. “Connected TV Viewing Is Not Returning To Pre Pandemic Levels.” Forbes.com. https://www.forbes.com/sites/bradadgate/2020/06/12/connected-tv-viewing-is-not-returning-to-pre-pandemic-levels/?sh=107087463aa5.