Roku Test Automation — Advanced Tips and Creative Solutions for Common Challenges

Mike Theiss
Published in Globant
Jun 15, 2022 · 16 min read

Overview

In my previous article, Roku WebDriver Test Automation: Lessons From the Field, I provided an overview of the Roku platform’s official test automation solution, discussed the value of automating tests on the platform, and gave some tips for leveraging the out-of-the-box solution to create effective tests. In this write-up, I will dig deeper and share some additional implementation ideas and suggestions to take your Roku automation to the next level. Before getting started, I would recommend reading my previous article linked above, or at least having familiarity with the core Roku WebDriver solution. This article is intended to go beyond the basics of implementing a Roku test automation solution by describing some common challenges and presenting approaches that can be used to solve them. Though the primary focus of this article is test automation development using Roku WebDriver, many of the concepts and recommendations apply similarly to other OTT test automation solutions.

Handling Dynamic Content and Test Data

Keeping Up with Frequent Updates

It should be expected that the content on a typical Roku channel will be extremely dynamic and personalized. Most streaming services roll out new content daily, with regular updates at various times within a given business day. To engage users, content will almost always be personalized in some way. Even the most basic channel designs usually include some sort of “favorites” or “recently watched” list to make it easy for users to return to content they find of interest. Most channels also incorporate recommendation algorithms to personalize their offering. These designs are not unique to Roku, but when developing end-to-end automated tests for the platform you will need to account for this one way or another.

Use of Mock Data and Controlled Test Data

A common practice when automating UI tests is to use mock data or another controlled test data source. Many consider this a best practice. Mock data would typically be data embedded into the application source repository which could be used to validate the application functionality without needing to request the data externally. Other types of controlled test data could include data that your test automation solution injects with a proxy, or data that lives in another static data source that both the application and your test automation solution can consume.

Running tests with data you can control will allow you to run your tests in a consistent, predictable manner and to design tests which isolate the functionality of your application’s components. It also allows you to ensure that your tests cover specific scenarios: for example, a text field of a certain length, or buttons and other interactive elements with special functions that a user may or may not otherwise encounter, depending on what dynamic content is present in the feed for a specific user at a particular time.

In the context of a Roku channel, leveraging this approach requires that your development channel be equipped with a mechanism to consume mock data or another controlled data source.

If leveraging mock data or other controlled test data is part of your team’s workflow, you should certainly incorporate it into your test automation workflow. If you plan to implement gates for continuous integration based on automated end-to-end tests, it is desirable to control the content in a way that allows your tests to remain stable.

That having been said, creating a solution to leverage mock data, and producing and updating the data itself, requires a commitment and investment of time and resources. Some Roku teams I have worked with used very little mock or controlled data as part of their workflow. Mock data also only goes so far: you will still need to test video playback with appropriately encoded streams that replicate the actual end user experience. Consider what you need from your test content and the resources you have available as you come up with a plan to successfully achieve your test automation objectives.

Using Representative, Dynamic Data In Test Automation

In my experience working with video streaming services, I have found that one of the biggest challenges for development teams is keeping up with the constantly evolving, seemingly endless scenarios involving dynamic data returned in content feeds. Many of the most critical bugs I have seen while testing Roku apps have resulted from some sort of unanticipated data coming through the production or test content feed which the channel was not equipped to handle. Many of these issues resulted in crashes. The best way to help ensure that your Roku channel can withstand the chaos of real-world data is to test it with real-world data!

The approach I have used to incorporate real-world data into test automation workflows is to ‘pre-flight’ test data. Before the test runner launches the channel and runs a set of tests, the test automation solution calls the same content APIs, in the same way that the Roku channel would. The tests can then consume the same data that the channel does, and that data can inform the expected results of the tests and/or tell your test runner the right number of button presses needed to navigate to the right place to execute a target scenario. In this context, the tests are dynamic relative to the data that is present. For example, some tests may or may not run depending on whether required content is available, or may execute differently depending on what data is present. This will add some level of volatility to your test solution and the tests themselves, but it will also force you to keep track of the service-side changes that your Roku channel needs to handle and help ensure that your automated tests properly represent the end user experience.
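To make the ‘pre-flight’ idea concrete, here is a minimal sketch in JavaScript (the language my team used for its test runner). Everything service-specific is hypothetical: the feed URL, the response shape, and the helper names fetchHomeFeed and planNavigation are placeholders for whatever content API your channel actually calls. It also assumes Node 18+ for the global fetch.

```javascript
// Pre-flight the same content feed the channel will request, then derive
// expected results and navigation steps from it before the tests start.
// URL and response shape are hypothetical placeholders.
async function fetchHomeFeed() {
  const res = await fetch('https://api.example.com/v1/home-feed?platform=roku');
  if (!res.ok) throw new Error(`Feed request failed: ${res.status}`);
  return res.json();
}

// Work out how many "down" presses reach a target row, and which title the
// test should expect to find focused once it gets there.
function planNavigation(feed, targetRowName) {
  const rowIndex = feed.rows.findIndex((row) => row.name === targetRowName);
  if (rowIndex === -1) return null; // row absent: the dependent test can skip itself
  return {
    downPresses: rowIndex,
    expectedFirstTitle: feed.rows[rowIndex].items[0].title,
  };
}
```

A test can then call planNavigation() in its setup and skip itself when the target row is missing, which is exactly the data-driven behavior described above.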

Predictability and Stability or Representative Results?

The approach described above is not the most conventional in test automation, and some advise against it. If you want your tests to run green consistently, this will admittedly work against that to some degree. Personally, I relish the opportunity to make my tests dynamic. I like to use randomization in my tests as well. For example, a test may select a row list and randomly navigate to an item in that row, or randomly select a row on the page to navigate to (see the sketch below). Your tests won’t run exactly the same way each time, but that can be a benefit when the tests are thoughtfully crafted, and it more accurately represents the types of scenarios that users are likely to encounter. You won’t have unlimited time to run every possible test scenario, so adding some amount of variation to your tests will increase coverage across test runs and add long-term value by exposing issues that would have been missed by tests that execute exactly the same way each time they are run. I would liken this to regression tests run manually by a test engineer, which tend to have similar variation across runs organically. You will have failures, and you will need to invest time and energy in analyzing the test results, but when done right this can be a great tool to make the product better.
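As a rough illustration of that kind of randomization, here is a sketch using the same hypothetical feed shape as the earlier pre-flight example; logging the random choices is what keeps a failing run reproducible.

```javascript
// Pick a random row and a random item within it so coverage varies across
// runs; log the selection so a failure can be retraced afterwards.
function pickRandomTarget(feed) {
  const rowIndex = Math.floor(Math.random() * feed.rows.length);
  const row = feed.rows[rowIndex];
  const itemIndex = Math.floor(Math.random() * row.items.length);
  console.log(`Selected row ${rowIndex} ("${row.name}"), item ${itemIndex}`);
  return { rowIndex, itemIndex, expectedTitle: row.items[itemIndex].title };
}
```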

Purpose Built Test Suites

You can always create more than one test suite to fulfill different purposes. As an example, you could create one suite with the most critical, stable tests to fulfill needs such as build validation, which require more consistency, while reserving the more volatile and detail-oriented tests, which require more analysis, for another suite. This way, you can get the best of both worlds.

Testing Video Playback

Roku WebDriver Player Method

Requests to the Roku WebDriver player method will return the most essential state information about the video player, including the current player position, the total running time of the clip/segment, and the transport state (play, pause, rewind, fast-forward, etc.). You can also leverage the Roku WebDriver element method to query any UI elements that load on top of the player. Some elements of the user experience will likely not be reachable via test automation, depending on the implementation, but initially we will focus on the things we can test.

Figure 1: Sample JSON output returned after querying Roku WebDriver’s player method.
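In practice, a player query is a single HTTP call to the WebDriver server. The sketch below assumes the GET /v1/session/{sessionId}/player route and the value.State / value.Position / value.Duration response fields from the Roku WebDriver reference; verify both against the WebDriver version you are running.

```javascript
// Query Roku WebDriver's player method and pull out the essentials.
// Route and field names should be verified against your WebDriver version.
async function getPlayerState(baseUrl, sessionId) {
  const res = await fetch(`${baseUrl}/v1/session/${sessionId}/player`);
  const body = await res.json();
  return {
    state: body.value.State,        // e.g. "play", "pause"
    position: body.value.Position,  // e.g. "65000 ms"
    duration: body.value.Duration,
  };
}
```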

Leveraging and Testing “Bookmarking”

Most Roku channels include a ‘bookmarking’ feature which allows users to resume playback where they left off after exiting the player. Implementing this feature is a requirement for Roku Channel Certification when playing back VOD content longer than 15 minutes. Test automation is a great way to test this type of feature. In this case, the streaming video provider’s Web services should have an API that can be used to get and set the current playback position, which allows playback tests to be controlled in a variety of ways. You should be able to assert whether playback resumes at the correct position by comparing the actual playback position to the playback bookmark position that the services return just prior to running the test. By setting the playback position via Web services, you should also be able to set up test scenarios that would be tedious to set up manually using transport controls, such as starting playback just before an ad break or just before an end card should load. However, making effective use of test automation in this context will require a strong understanding of the specific implementation, including familiarity with the Web services the client uses.
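A hedged sketch of what such a resume test could look like follows. The setBookmark helper and its endpoint are entirely hypothetical stand-ins for your provider’s bookmark API, getPlayerState is the player-query helper sketched earlier, and the millisecond parsing assumes the "NNNN ms" position string format that Roku WebDriver returns (verify on your version).

```javascript
// Hypothetical bookmark-resume check: set the bookmark through the provider's
// Web services, start playback, then assert the player resumed nearby.
async function setBookmark(contentId, positionMs) {
  // Placeholder endpoint -- replace with your provider's real bookmark API.
  await fetch(`https://api.example.com/v1/bookmarks/${contentId}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ position: positionMs }),
  });
}

async function assertResumesNearBookmark(baseUrl, sessionId, contentId) {
  const bookmarkMs = 10 * 60 * 1000; // start the title ten minutes in
  await setBookmark(contentId, bookmarkMs);
  // ...launch the title and wait for playback to begin (omitted)...
  const player = await getPlayerState(baseUrl, sessionId);
  const actualMs = parseInt(player.position, 10); // "600000 ms" -> 600000
  if (Math.abs(actualMs - bookmarkMs) > 10000) {
    throw new Error(`Resumed at ${actualMs} ms, expected ~${bookmarkMs} ms`);
  }
}
```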

If the service provider delivers pre-roll content as separate video files or playlists, the position and running time returned by the Roku WebDriver player method will correspond to the segment currently playing. However, some services use ‘server-stitched’ playlists that combine all of the video segments associated with the content into a single stream. In the latter case, you will only be able to check the overall playback position and transport state using the player method, but any interactive elements or indicators that load on top of the player will still be testable using standard WebDriver element queries. As an example, you should be able to confirm that ad indicators, countdown timers, skip buttons for pre-roll content, and end cards load and function as expected.

Testing Player Ad Integrations

As noted above, you can leverage bookmark API calls to set up tests which check that UI elements for ad breaks load as expected at the correct time. You can also write tests to confirm that fast-forward and rewind controls are disabled during ad breaks. Using Roku WebDriver, these tests use a combination of element queries, player queries, and remote commands. After issuing remote commands, you can check that the transport is in the expected state (for example, play vs. fast-forward or rewind) using a player query.
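For example, a sketch of a "trick play disabled during ads" check might look like the following. The press route and the "fwd" button name are assumptions about the Roku WebDriver API (verify them for your version), and getPlayerState is the helper sketched earlier.

```javascript
// During an ad break, send a fast-forward press and assert that the transport
// stayed in normal playback. Route, button name, and the exact state strings
// returned by the player method should be verified for your WebDriver version.
async function assertFastForwardBlocked(baseUrl, sessionId) {
  await fetch(`${baseUrl}/v1/session/${sessionId}/press`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ button: 'fwd' }),
  });
  const player = await getPlayerState(baseUrl, sessionId);
  if (player.state !== 'play') {
    throw new Error(`Expected playback to stay in "play" during the ad break, got "${player.state}"`);
  }
}
```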

One obstacle I have run into when testing ad integrations via automation is that ad break durations (as well as the actual ad content) can vary across playback instances. In this case, you can likely determine the insertion points for the ads via an API call to the content provider’s Web services. However, your test may not know the expected playback duration of the ad break prior to executing the test, since this functionality is normally managed by a third-party service which integrates with the Roku video player. If the channel displays a countdown timer in the UI during ad playback, you may be able to key off the UI to determine when the transition from ad break to featured content will take place.
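One pragmatic way to key off the UI is to poll for the countdown element until it can no longer be found and treat that as the ad-to-content transition. In the sketch below, the element request payload follows the elementData format from the Roku WebDriver samples, and "AdCountdownTimer" is a hypothetical node name; both the locator strategy and the not-found behavior depend on your channel and WebDriver version.

```javascript
// Poll the WebDriver element method until the ad countdown node disappears,
// then assume featured content has resumed. "AdCountdownTimer" is a made-up
// node name; use whatever your channel's SceneGraph actually exposes.
async function waitForAdBreakToEnd(baseUrl, sessionId, timeoutMs = 120000) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await fetch(`${baseUrl}/v1/session/${sessionId}/element`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        elementData: [{ using: 'attr', attribute: 'name', value: 'AdCountdownTimer' }],
      }),
    });
    // Treat any non-success response as "element not found" -- i.e. the ad
    // break is over. Verify the exact not-found behavior for your version.
    if (!res.ok) return;
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  throw new Error('Ad break did not end within the expected time');
}
```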

Automation Limitations in Roku Video Player

Some interactions in the Roku video player are governed by the operating system. Unfortunately, the OS driven in-player menus are not currently queryable via Roku WebDriver. This means that you most likely will not be able to test that closed caption and language menu options appear as expected or to determine the state of these controls as part of your tests.

Dealing with Crashes and Unexpected Application Exits

Our Biggest Fear

For those of us entrusted with testing Roku channels, perhaps our biggest fear is that a highly visible crashing bug will find its way to customers’ TV screens across the country or around the world on our watch. The result is even worse if it impacts very prominent titles, like a popular movie being launched directly to a streaming service during a global pandemic! Your goal in creating test automation for Roku should be to find these crashing bugs before consumers can.

Since a primary goal of our test automation is to expose these types of issues, we also need to be prepared to handle these scenarios to some extent when we design our test automation solution. This is one of the more challenging aspects of Roku automation and, regrettably, the official Roku WebDriver solution does not contain any features or tools to help you here.

Types of Crashes on Roku

There are two basic types of crashes for us to concern ourselves with:

  1. Logic Crashes
  2. Memory Crashes

An example of a logic crash would be a scenario where a function is called with a value or data type that the developer was not expecting to handle, causing a logic failure. In this case, when testing a sideloaded channel build with the debugger, the application may freeze and the debug console output, which can be monitored by connecting to the device via telnet on port 8080, will drop into the interactive debugger. For an end consumer who loaded your channel from the Roku Channel Store, the application will simply force close in these cases. These types of crashes are often consistently reproducible by retracing the steps that led to the crash.

Memory crashes occur when the actions of the user combined with the logic the channel uses to manage data bring the application or the Roku OS itself to their breaking points. These types of issues are typically more difficult to reproduce, especially when executing manual tests. Long running repetitive automated tests are a good way to expose these types of issues. When a memory crash happens, it may manifest itself in the same way as a logic crash (via the debugger) but another common result will be that the Roku will shut down and reboot itself while the channel is running.

If you don’t account for these scenarios, when they occur your Roku automation train will run off the tracks. This type of scenario is likely to prolong the execution of your tests because without some type of intervention, your test runner will continue to issue remote commands and other queries to assert against expected test results. Meanwhile, the actual Roku may not even have your channel up on the screen any longer!

Logging and Crash Detection

How can we mitigate this? My first recommendation is to include a logging component in your automation solution to automatically connect to the Roku debugger via telnet and stream the output to a file each time a test run starts. Even if you do nothing more, by adding this capability to your solution, you will ensure that you have the debug output corresponding to your test run saved so you can share it with the BrightScript developers on your team to understand the root cause anytime a crash or other unexpected event happens.

You can build upon the logging functionality further to detect a crash in the telnet stream by looking for matches for the string “Brightscript Debugger>” which will occur anytime the interactive debugger kicks in. You could add a set of tests at the end of your run to check the logged output for issues of concern to explicitly trigger failures based on crashes anywhere in the log files. In addition to detecting crashes using the above string, you could also check for warnings and other scenarios based on logging output for your development team to address. A more sophisticated approach would be to detect the crash in real time and modify the way the tests run to mitigate the issue (more to come on that later).
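A minimal sketch of that logging component in Node.js follows: it opens a raw TCP connection to the device’s debug port with the built-in net module, tees everything to a log file, and invokes a callback when the interactive debugger banner appears. (For simplicity it matches within a single chunk of output; a production version would buffer across chunk boundaries.)

```javascript
const net = require('net');
const fs = require('fs');

// Stream the Roku debug output (telnet port 8080) to a log file and invoke a
// callback whenever the interactive debugger banner shows up in the stream.
function startDebugLogger(rokuIp, logPath, onCrashDetected) {
  const logStream = fs.createWriteStream(logPath, { flags: 'a' });
  const socket = net.connect({ host: rokuIp, port: 8080 });

  socket.on('data', (chunk) => {
    const text = chunk.toString('utf8');
    logStream.write(text);
    if (text.includes('Brightscript Debugger>')) {
      onCrashDetected(rokuIp);
    }
  });
  socket.on('error', (err) => logStream.write(`\n[logger error] ${err.message}\n`));
  return socket; // the caller can destroy() it when the test run finishes
}
```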

But what about those dreaded memory crashes that cause your Roku to reboot? In that case, the logging output will likely stop abruptly and your telnet connection to the Roku will be interrupted, so you won’t be able to detect the crash via a string match in the log output. However, using Roku WebDriver, you can call the “current_app” method to determine which channel has focus. Immediately after a reboot, the current_app method will report “Roku”. If your test runner keeps pressing buttons during a test sequence following such a crash, the Roku may very well launch another random channel that happens to be installed on the system, and your tests will start trying to interact with it. Are you starting to see how messy this can be?

Custom Crash Detection Solutions

My solution was to develop a custom monitor service that runs while the test runner is executing to keep track of device state. I have used this for both scenarios presented above (logic crashes and memory crashes). Since my test automation solution executes against multiple Roku devices in parallel, I track which device has crashed using an ID for each device, but I use the same service for all devices. The service can receive GET and POST requests to check or update the status of a device. Logic crashes are easy to detect with string matching as described above, so I designed my logging module to POST to the monitor service any time a device has crashed based on the telnet (port 8080) log output from the device.
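The shape of such a monitor service can be very small. The sketch below uses Express (an assumption; any HTTP framework would do) and keeps per-device status in memory, mirroring the GET/POST behavior described above. The routes, port, and payload fields are illustrative rather than the exact service my team runs.

```javascript
const express = require('express');

// Minimal device-status monitor: the logging module and focus checker POST
// status updates, and the test runner GETs them before each test.
const app = express();
app.use(express.json());

const deviceStatus = {}; // keyed by device id, e.g. the Roku's IP address

app.post('/devices/:id/status', (req, res) => {
  deviceStatus[req.params.id] = {
    state: req.body.state, // e.g. "healthy" | "crashed" | "out-of-app"
    reason: req.body.reason || null,
    updatedAt: new Date().toISOString(),
  };
  res.sendStatus(204);
});

app.get('/devices/:id/status', (req, res) => {
  res.json(deviceStatus[req.params.id] || { state: 'unknown' });
});

app.listen(9090, () => console.log('Device monitor listening on :9090'));
```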

Tracking the application focus requires slightly more nuanced logic. My solution checks the “current_app” at regular intervals and pushes that information to the monitor service. If the expected app is not in focus for multiple checkpoints, you can be assured that your Roku test automation train has gone off the tracks. This will not only handle the case where your Roku crashes and reboots itself, but will also detect cases where the test runner issuing the remote commands gets out of sync with the user experience in any way that leads the user outside of the target Roku channel under test. This may happen more often than you might expect, especially when supporting content services misbehave and the user sees an error you didn’t anticipate during a test run.
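The focus check itself can be a small polling loop: query current_app on an interval, count consecutive misses, and only report a definitive failure to the monitor after several checkpoints in a row. The current_app route and the Title field are assumptions based on the Roku WebDriver reference, and the miss threshold, interval, and monitor URL are illustrative.

```javascript
// Poll Roku WebDriver's current_app method and flag the device as out of the
// channel only after several consecutive misses (a single miss may be noise).
function watchChannelFocus(baseUrl, sessionId, expectedAppTitle, deviceId) {
  let consecutiveMisses = 0;
  setInterval(async () => {
    try {
      const res = await fetch(`${baseUrl}/v1/session/${sessionId}/current_app`);
      const body = await res.json();
      const focusedApp = body.value && body.value.Title; // field name may vary
      consecutiveMisses = focusedApp === expectedAppTitle ? 0 : consecutiveMisses + 1;
    } catch {
      consecutiveMisses += 1; // an unreachable WebDriver counts as a miss too
    }
    if (consecutiveMisses >= 3) {
      await fetch(`http://localhost:9090/devices/${deviceId}/status`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ state: 'out-of-app', reason: 'channel focus lost' }),
      });
    }
  }, 10000);
}
```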

Crash Handling

Once the device gets into one of these failure states, you can decide what you want to do from there, but in a multi-step test sequence involving remote navigation, attempting to return the test runner to the expected state in the middle of the sequence might be a tall order. I have not attempted to do that myself. My goal was not to prevent dependent tests from failing, but to prevent dependent tests from executing in the way that they normally would. It’s a waste of time to continue sending remote commands to the WebDriver and querying the WebDriver elements to check if the state is as expected if we know the device has crashed or the channel is in an unexpected state. My solution uses a “circuit breaker” design to prevent those things from happening.

My team was already using a custom JavaScript library to query WebDriver because we started our project before Roku released their JS library, so I modified our library to include this functionality. For those using one of Roku’s WebDriver libraries (for JavaScript or Robot Framework), you will need to extend the library to support this. In my solution, before each test runs, the monitor service is queried to check the device status. If a healthy state is returned, everything works normally, but once the device enters a definitive failure state, the circuit breaker kicks in and blocks the requests to the WebDriver API to send remote commands, state queries, and related polling attempts so that tests fail quickly without subjecting the Roku test device to unnecessary torture.

Circuit Breaker Implementation

The ‘circuit breaker’ design pattern is commonly used in microservices architectures. In this case I am using the term more generically, but the concept is similar to the pattern used in microservices, as well as to a physical circuit breaker that you would find in a typical household garage. Basically, the idea is to implement methods that allow you to trip the breaker (for example, stop sending remote commands and querying the application state after x consecutive failures to find the expected application in focus) and to reset the breaker (start sending commands again once an intervening action has been taken to recover the state of the application).
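In code, the breaker can be as simple as a shared flag consulted by a request wrapper, with trip and reset functions wired to the monitor checks and to the blocks that relaunch the channel. The sketch below is a generic illustration, not my team’s library: the guardedRequest wrapper and its bypass option are hypothetical, with the bypass corresponding to tip 1 in the list below.

```javascript
// Minimal circuit breaker around WebDriver requests: once tripped, guarded
// calls fail fast instead of hammering a crashed or out-of-focus device.
let breakerTripped = false;

function tripBreaker(reason) {
  breakerTripped = true;
  console.warn(`Circuit breaker tripped: ${reason}`);
}

function resetBreaker() {
  // Called from the test blocks that exit and relaunch the channel.
  breakerTripped = false;
}

async function guardedRequest(url, options = {}, { bypassBreaker = false } = {}) {
  if (breakerTripped && !bypassBreaker) {
    // Fail fast so the test reports a failure without waiting out timeouts.
    throw new Error('Skipped WebDriver request: circuit breaker is open');
  }
  return fetch(url, options);
}
```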

Though my test suites typically have extended sequences with a series of remote navigation events and state checks, I also start each major test section with remote commands that exit the channel and restart it. This allows that set of tests to be run independently with limited dependencies (though keep in mind that state such as whether a user is logged in, or whether parental control restrictions are in place, will typically persist across channel launches and needs to be accounted for). The solution I implemented to mitigate crashes calls the monitor service at the beginning of each test block that restarts the app, effectively flipping the breaker back to its default position so that remote commands and state queries can be sent again.

Additional Tips for Crash Handling

  1. If you implement a circuit breaker design, make sure that any methods or functions used to exit and relaunch the app have a way to bypass the mechanism that suppresses other remote commands; otherwise, every other test in your suite will fail and you won’t effectively recover from the crash state.
  2. To verify your detection of the interactive debugger, you can add a “STOP” statement to a commonly used function in the channel source before sideloading a build. This way you can trigger the interactive debugger in a consistently reproducible way.
  3. If channel focus detection is implemented as described above, you can simply use your remote to exit the channel during your test or unplug the device to simulate a memory crash reboot.

Conclusion

We’ve discussed a variety of challenges that impact test automation solutions for Roku channels and solutions to address them:

  • We can use mock data to gain more control over our tests and/or pre-flight real data from services so that our tests are more representative of the end user experience and reflect issues exposed by rapidly changing content.
  • Roku WebDriver’s player method gives us core details about the state of the video player. By leveraging API interactions with the content provider’s services, we can set up a variety of player test scenarios that are tedious to test manually, such as accurately verifying that playback resumes from bookmark positions.
  • Adding logging capability to our test automation solution is fairly easy to do and will help you debug crashes.
  • By monitoring logs and the application focus, we can implement other mitigation tactics to allow our tests to run more smoothly and/or fail faster when disruptions happen.
