The Ultimate Solution for Spatial Navigation on React Based Smart TV Apps

Published in Norigin Media Tech Blog, Mar 28, 2022

Building cross-platform web applications is becoming a requirement due to the increased variety of Connected TVs and Smart TV devices available today. Most of them run web-based app platforms, so frontend JavaScript libraries like ReactJS can be used to build apps for these devices.

As one can imagine, people consume TV streaming content in browsers like Chrome and Firefox, or on Smart TVs such as Samsung Tizen, LG WebOS or the newer Hisense Vidaa devices. At Norigin Media, we build streaming TV apps for media companies and use ReactJS extensively to efficiently launch apps across a multitude of browsers and Smart TV devices.

Open Source Spatial Navigation

While building TV Streaming apps or any Web application, one must consider the large number of content assets that need to be navigated around, with key navigation on browsers or remote control navigation on Smart TVs and Connected TVs.

Instead of building specific and repeated logic for directional navigation across all these apps, Norigin Media created an open-source library that can be used by front-end developers building React apps across devices.

The Norigin Spatial Navigation library eases the effort needed to develop navigation logic on websites & apps, controlled by your keyboard (browsers) or remote controls (Smart TV or Connected TV). With this library, software developers only need to initialize the service, add the hook to components that are meant to be focusable and set the initial focus.

This Spatial Navigation library will automatically determine which components to focus next while navigating with the directional keys.

See our other blog post, which discusses tech choices for building Smart TV or Connected TV apps: https://noriginmedia.com/2020/11/24/tech-choices-for-connected-tv-app-development/

This open source project addresses Smart TV or Browser Navigation with React.

User Input Methods within Apps or Browsers

When developing TV Apps for Smart TVs (or other Connected TV devices like game consoles or set top boxes), one must consider the unique aspect of their “User-Input” methods. This is normally how we navigate between assets on devices via specific directional keys on Remote Controls offered with TVs.

While most remote controls share a common set of keys, some are rather unique: LG Smart TVs use a pointer input (Magic Remote) and Apple TV uses a directional touchpad.

All apps on such TV devices or browsers offer a lot of content that requires intuitive and seamless navigation between assets. This way of navigation is called Spatial Navigation (2D or Directional Navigation).

To interact with the assets on the browser or TV screen, we browse by moving the focus to (navigating to) a chosen asset and pressing a selection button (“OK” button, “Enter” button, etc.) when the element is focused. There should always be exactly one focused element on the screen.

Developers normally need to implement the logic for this type of navigation themselves and individually, as there is no default implementation. At least not on the Web platforms. The complexity of this feature is often underestimated and it can be quite challenging in certain scenarios.

There is always a risk of introducing bugs like having more than one focused element on the screen or losing focus completely. This would then require a strict and robust state management system to keep track of what is focused on the screen and how to transfer the focus when transitioning between screens or modal elements.

Most common patterns to implement Spatial Navigation

Luckily React has quite a lot of ways to organize and manage the state, so let’s have a look at a few patterns of how to implement spatial navigation.

Distributed Navigation Logic. This is perhaps the most straightforward pattern where each component keeps the state of which child component is focused now, and also handles the key events to decide what to focus in response to those events. While this method might give full control of navigation logic inside one component, it is not the most scalable solution. It requires implementing navigation logic for every component. This logic gets spread across the whole application and might take nearly 15% of the codebase.

It also means that navigation logic needs to be tested for every single component, and any improvements made in one component don’t benefit other use cases. Another issue with this approach is that every component needs to be aware of its parent component’s logic (e.g. expecting some prop from the parent to indicate that it is focused now) as well as the structure of its child components. This becomes tricky when developing UI components in isolation from each other, since components might be frequently rearranged and ideally should not be aware of each other.

When components are replaced or moved around, it also requires updating the navigation logic in all relevant places. In the example below you can see a simplified implementation of distributed navigation for GalleryRow that renders multiple GalleryItems. It handles horizontal key events and updates the currently focused item index:

Sample code for distributed focus logic
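
As a rough stand-in for this pattern, here is a minimal sketch written as plain JavaScript so it runs without React. The names nextFocusedIndex and createGalleryRow are hypothetical, and the key names assume the standard KeyboardEvent.key values; a real GalleryRow would keep the index in component state and pass a focused prop to each GalleryItem.

```javascript
// Hypothetical sketch: the focus logic a GalleryRow would own under this pattern.
// Returns the index that should be focused after the given key press.
function nextFocusedIndex(focusedIndex, key, itemCount) {
  if (key === 'ArrowLeft') {
    return Math.max(0, focusedIndex - 1); // stop at the first item
  }
  if (key === 'ArrowRight') {
    return Math.min(itemCount - 1, focusedIndex + 1); // stop at the last item
  }
  return focusedIndex; // other keys are not handled by the row
}

// The row holds the focus state itself and tells each item whether it is focused.
function createGalleryRow(titles) {
  let focusedIndex = 0;
  return {
    onKeyDown(key) {
      focusedIndex = nextFocusedIndex(focusedIndex, key, titles.length);
    },
    render() {
      return titles.map((title, index) => ({ title, focused: index === focusedIndex }));
    },
  };
}
```

Note that every container in the app would need its own variant of this logic, which is exactly the scaling problem described above.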

Focus Maps. This is another common pattern while working with spatial navigation. A Component might have a Focus Map, an object that is pre-calculated for each direction and contains the focus keys (focus IDs or indices) of the elements that need to be focused in response to keypress events. This allows for keeping parent components clean from the key handling because it’s done in the children components. The parent component is still responsible for constructing the Focus Map.

Sample code for Focus Maps
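
A minimal sketch of the idea, again in plain JavaScript. The helper names buildRowFocusMap and resolveFocus, the focus keys, and the shape of the map are all assumptions made for illustration; the point is only that the parent pre-calculates the neighbours and each child looks up its own entry when a key is pressed.

```javascript
// Hypothetical helper: the parent builds a map of neighbours for each child focus key.
function buildRowFocusMap(focusKeys, leftOfFirst) {
  const focusMap = {};
  focusKeys.forEach((focusKey, index) => {
    focusMap[focusKey] = {
      // Special case: pressing Left on the first item can leave the row.
      left: index === 0 ? leftOfFirst : focusKeys[index - 1],
      // No right neighbour for the last item.
      right: focusKeys[index + 1] ?? null,
    };
  });
  return focusMap;
}

// Each child resolves its own key presses against the map it was given.
// When there is no neighbour in that direction, focus stays where it is.
function resolveFocus(focusMap, currentKey, direction) {
  const target = focusMap[currentKey]?.[direction];
  return target ?? currentKey;
}
```

For example, buildRowFocusMap(['item-0', 'item-1'], 'side-menu') lets the first item hand focus to a hypothetical side menu on Left without the row handling any key events itself.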

This method is just a way to delegate key handling to children components. In some scenarios, it makes it easier to define special cases like for example focusing the Side Menu when pressing Left from the first element in the Row. However, it’s still yet another variant of Distributed Navigation Logic and doesn’t scale well.

Helper Components. In order to organize the spatial navigation logic within the app we can use helpers such as FocusableComponent, HorizontalList, VerticalList and Grid to handle directional key events and manage the state of focused child components. This helps to encapsulate the navigation logic and makes it easy to wrap any component inside FocusableComponent. These helpers can store the current focus key in a context, and every child FocusableComponent can subscribe to this context to see when it gets focused.

This pattern is used in BBC T.A.L. and is the middle ground between Distributed and Centralised Logic. The downside of this pattern is that you have to follow a strict structure in your component tree, organizing components into rows, columns, or grids and wrapping every focusable component. If you have a dynamic layout or use A/B testing in your app, this can get hard to maintain, since you need to update the structure of rows/columns whenever components are moved around.

Please note that this pattern also introduces the concept of Focusable Tree, where your focusable components are grouped inside their parent focusable Containers/Areas.

Doing it Smart

Since we are making apps for Smart TVs, our navigation system also has to be Smart, otherwise, it wouldn’t work ¯\_(ツ)_/¯

At Norigin Media, we care a lot about Developer Experience (DX). We are constantly working towards improving it and making our own developer lives easier. Good DX brings better motivation, which brings better code quality and makes us more efficient. The main motivation to create our own solution for spatial navigation was to have excellent DX when implementing this feature in any Smart TV app, or any other web app that requires directional key navigation.

All the solutions above are still far from the ideal scenario from the developer’s point of view. So what is the easiest way of implementing it in the code? Could we build a smart system that will allow us to simply say “I want these components on the screen to be focusable” and it will figure out how to navigate between them? Why do we have to care about rows, columns, etc. if all our components are already on the screen and we know their dimensions and coordinates? How can we avoid handling the parent-to-child focus propagation manually?

The inspiration for this came from an article from Netflix.

Implementation

Creating Focusable components

What is the minimal effort needed to make a component focusable? One way is to create a wrapping component and use its render props to provide the focusable functionality, e.g. Focusable:

<Focusable>
  {(focusProps) => <Component {...focusProps} />}
</Focusable>

This requires creating another nested level in JSX, as well as introducing this wrapper in render functions, contributing to the “wrapper hell”.

Another way is to use HOC (higher-order component):

const FocusableComponent = withFocusable()(Component);

However, the HOC approach is now considered outdated, as is recompose, the most popular library that provided a lot of useful HOCs. It also contributes to the “wrapper hell”. We have used this API in the past, but recently we migrated this functionality to React Hooks. We have implemented a useFocusable hook that is used in React components that are meant to be focusable:

useFocusable hook example

So now that we have a focusable component, what is the minimal functionality that this component needs to be enhanced with?

First of all, it needs to have a focused state to indicate when it is focused. Also, each focusable component needs to have some focusKey to identify it. In order to navigate between focusable components, something needs to store the global state of the currently focused component on the screen. We didn’t want to have any navigation logic inside the components. Instead, each focusable component needs to register itself in a global system and provide a reference to its DOM node on mount (for example by assigning its ref in the example above), and remove itself from this system on unmount. So the next step is to create the global system or service to keep the list of all focusable components and manage the state of the currently focused component.
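
To make the register/unregister idea concrete, here is a minimal sketch of such a global registry. Everything in it (createFocusRegistry, the method names, the onFocusChange callback) is hypothetical and only illustrates the concept; it is not the library's internal implementation.

```javascript
// Hypothetical registry; names and shape are illustrative, not the library's internals.
function createFocusRegistry() {
  const components = new Map(); // focusKey -> { node, onFocusChange }
  let focusedKey = null;

  // Moves focus to the given key and notifies the old and new components.
  function setFocus(focusKey) {
    if (!components.has(focusKey) || focusKey === focusedKey) return;
    const previous = components.get(focusedKey);
    if (previous) previous.onFocusChange(false);
    focusedKey = focusKey;
    components.get(focusKey).onFocusChange(true);
  }

  return {
    // Called on mount with the component's DOM node (used later for measuring).
    register(focusKey, node, onFocusChange) {
      components.set(focusKey, { node, onFocusChange });
    },
    // Called on unmount so the registry never points at a removed component.
    unregister(focusKey) {
      components.delete(focusKey);
      if (focusedKey === focusKey) focusedKey = null;
    },
    setFocus,
    getFocusedKey: () => focusedKey,
  };
}
```

Because only the registry knows which key is focused, there is a single source of truth, which avoids the "two focused elements" class of bugs mentioned earlier.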

Centralized Navigation Logic

The Spatial Navigation Service keeps the focus key of the currently focused component. It also serves as storage for all focusable components on the screen. Since we are not handling any navigation logic in the components, this logic is encapsulated in the Service (centralized). The basic idea is quite straightforward: when the user presses a directional key, the Service tries to find the best candidate to be focused next in that direction, based on the shortest distance between the currently focused item and a potential target item. The algorithm itself is quite advanced and inspired by this implementation.

A simplified explanation of the navigation algorithm
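
The core idea can be sketched in a few lines. This is a deliberately simplified version that assumes centre-to-centre Euclidean distance and a strict "entirely in that direction" filter; the actual algorithm in the library is described above as more advanced. The function names and the { left, top, width, height } rect shape are assumptions for illustration.

```javascript
// Rects are { left, top, width, height }, e.g. as measured from a DOM node.
const rectCenter = (r) => ({ x: r.left + r.width / 2, y: r.top + r.height / 2 });

// True when `candidate` lies entirely in the given direction from `current`.
function isInDirection(current, candidate, direction) {
  switch (direction) {
    case 'left': return candidate.left + candidate.width <= current.left;
    case 'right': return candidate.left >= current.left + current.width;
    case 'up': return candidate.top + candidate.height <= current.top;
    case 'down': return candidate.top >= current.top + current.height;
    default: return false;
  }
}

// Returns the focus key of the closest candidate in that direction, or null.
function findNextFocusKey(current, candidates, direction) {
  const from = rectCenter(current.rect);
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    if (candidate.focusKey === current.focusKey) continue;
    if (!isInDirection(current.rect, candidate.rect, direction)) continue;
    const to = rectCenter(candidate.rect);
    const distance = Math.hypot(to.x - from.x, to.y - from.y);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best ? best.focusKey : null;
}
```

A null result means there is nothing to focus in that direction, so the focus should stay where it is.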

The Service also provides an interface for any focusable component to imperatively set focus to any other component, or onto itself:

const {focusSelf, setFocus} = useFocusable();

Each focusable component needs to be connected to the Service. In order to do this, we use the DOM ref, which is passed to the Service (as seen in the example before) so that it can measure the component’s size and coordinates for the navigation calculations.

Focusable Tree

Even though each focusable component reports its own dimensions and position on the screen, the UI on the screen is not flat; it has a certain hierarchy. We can have focusable elements inside scrolling lists or other wrappers/containers, so relying only on the global coordinates on the screen to measure the distance is not enough:

Menu item gets focused based on the shortest distance by global coordinates

In the picture above, when we try to navigate to the left, the next element according to the global coordinates is one of the menu items, but the expected behavior would be to focus the next left element in the scrolling row, which is off screen (marked with a dashed border).

In order to improve this, we have to structure our UI into a Focusable Tree. We can make scrollable lists focusable components, and even the whole page can be a focusable Container. In the example above we can make the Menu focusable (green border) as well as the scrollable list (blue border).

If we restrict the directional logic to prioritize sibling components first, the system will focus the next item in the scrollable list (so we can scroll to it afterward):

The next left item gets focused based on the distance inside the parent (blue) wrapper

But what if there are no good candidates to be focused amongst siblings anymore? This can be solved via delegation of the directional action to the parent focusable component:

Left navigation gets delegated to the scrollable list (blue border) and then performed from its edge to the closest sibling, which is the Menu (green border)

In this scenario, the system attempts to focus the sibling element to the left, but there is none. It delegates the “left” action to the parent list component, which then attempts to focus its sibling to the left, which is the Menu.

But focusing the Menu itself is not really enough. Intuitively we expect it to focus some Menu Item. This is done via down-tree propagation:

Menu that got focused in the previous example automatically propagates focus to the first child item

Even though it sounds complicated with the up-tree and down-tree propagation, you don’t have to worry about it since this is all done “automagically” by the Service.
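
For the curious, the three rules (siblings first, up-tree delegation, down-tree propagation) can be sketched for the "left" direction as follows. This is a simplified illustration under stated assumptions: nodes only carry a horizontal extent, the tree shape and all names (node, closestLeftSibling, firstLeaf, navigateLeft) are hypothetical, and the real Service handles all four directions and more edge cases.

```javascript
// Hypothetical tree node: { focusKey, rect: { left, width }, parent, children }.
function node(focusKey, left, width, children = []) {
  const self = { focusKey, rect: { left, width }, parent: null, children };
  children.forEach((child) => { child.parent = self; });
  return self;
}

// Closest sibling whose box lies entirely to the left of `from`, or null.
function closestLeftSibling(from) {
  if (!from.parent) return null;
  let best = null;
  for (const sibling of from.parent.children) {
    if (sibling === from) continue;
    const rightEdge = sibling.rect.left + sibling.rect.width;
    if (rightEdge > from.rect.left) continue; // not to the left of us
    if (!best || rightEdge > best.rect.left + best.rect.width) best = sibling;
  }
  return best;
}

// Down-tree propagation: a focused container hands focus to its first child.
function firstLeaf(target) {
  return target.children.length ? firstLeaf(target.children[0]) : target;
}

// Siblings first; if there is none, delegate the action to the parent (up-tree).
function navigateLeft(from) {
  for (let current = from; current; current = current.parent) {
    const sibling = closestLeftSibling(current);
    if (sibling) return firstLeaf(sibling).focusKey;
  }
  return from.focusKey; // nothing to the left anywhere: focus stays put
}
```

With a page containing a menu on the left and a row of items on the right, pressing Left on the second row item focuses the first row item, and pressing Left again climbs to the row, finds the menu as its left sibling, and lands on the menu's first item.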

Putting it all together

Here is an example of a simple implementation with the Menu, Menu Items and Gallery Row with Items inside:

Sample app implementing Norigin Spatial Navigation. Styles are omitted.

A more advanced example can be found on our Github repository for the Norigin Spatial Navigation library.

Debugging

In a real-world scenario something might go wrong: the focus might jump somewhere you don’t expect, or disappear for some reason. To make it easy for you to understand what’s happening we have implemented two debug modes:

initNavigation({
  debug: true,
  visualDebug: true
});

The first one will output helpful console statements to understand what is going on when the Service is trying to focus the next element in the direction of navigation. Visual debug will draw borders around each focusable component, as well as highlight the points that are used to calculate the distance between components when navigating between them.

Epilogue

We are constantly improving our navigation system as we find new use cases or simply try to make things even easier for developers. If you feel inspired, check it out on GitHub and of course feel free to contribute!

Take control of remote control, keyboard and directional navigation with code that is developed to suit all your needs!

The Norigin Spatial Navigation code is now available on GitHub at:

https://github.com/NoriginMedia/Norigin-Spatial-Navigation