What even is “navigation”?

Matt Carroll
Published in Super Declarative!
Jul 5, 2018 · 9 min read

The following thoughts on navigation are my own personal musings and do not constitute an official Flutter perspective on navigation. Please refer to official Flutter documentation for information about how the Flutter team thinks about navigation.

Navigation is one of those terms where the less you think about it, the more obvious its definition seems. But I’ve thought a lot about navigation within mobile apps over the past few years, and every day I’m less sure what “navigation” really means. This uncertainty leads to confusion about how to use various navigation systems/libraries because I’m not really sure what the authors of those systems mean by “navigation,” either. I’d like to work through my thoughts on navigation by rambling to all of you for a few minutes. Maybe this will start a conversation.

Atomic Navigation

Navigation in the early days of mobile, much like the early days of web, was straightforward. An app was composed of screens. The user was either viewing Screen X or Screen Y. Changing from Screen X to Screen Y was “navigation.” The graph of all screen transitions comprised the “navigation hierarchy” of the app. Easy.

The early, basic notion of navigation was easily facilitated by Activitys and Fragments on Android, and ViewControllers on iOS.

Eventually someone decided to take a photo on Screen X and persist it through navigation to Screen Y. Uh oh! What does this mean?! No worries, we can hack some kind of “transition system” to hold onto this “hero” element from one Activity / Fragment to another. Problem solved.

Example of a hero transition.

But this was a crack in the foundation. This was the moment that astronomers realized their Earth-centric model of the solar system wouldn’t quite work, no matter how complicated their equations became. This was the moment that Newton’s elegant equations of motion began to fail experimental tests. This was the beginning of Quantum Navigation.

Quantum Navigation

If a photo can be shared between Screen X and Screen Y, why not a card? If a card can be shared, why not a list of cards? If some things can be shared, why not all the things?

Example of transitions that blend the ideas of “navigation” and “animation.”

But now we have a real conundrum.

If 9/10 items on the screen remain unchanged, but the 10th item slides out and is replaced by a new item, is that navigation, or is that simply an animation?

What if it’s 50/50 instead of 90/10? Is that navigation?

Where is the line?

Is there a line?

Deep Linking and Navigation Recovery

It’s also important to remember deep linking behavior and navigation recovery.

Deep linking means starting an app with the notion of having already navigated somewhere. For example, maybe you launch an app and immediately take the user to his/her profile screen instead of dropping him/her at the typical start screen.

Deep linking does not merely render a different screen — it creates an implied “back stack.” If the user is immediately taken to a profile screen, then pressing the “back” button shouldn’t exit the app. It should instead take the user to the regular start screen. Even though the user never manually navigated from the start screen to the profile screen, a virtual navigation history was created anyway.
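One way to picture the implied back stack is to treat a deep link as a path and synthesize one history entry per path prefix. The following Dart sketch is illustrative only: `backStackForDeepLink` and the slash-separated path scheme are assumptions made for this example, not part of Flutter or any navigation library.

```dart
/// Builds an implied back stack for a deep link path.
///
/// Hypothetical helper: every prefix of the path becomes one history entry,
/// with the regular start screen ('/') always at the bottom of the stack.
List<String> backStackForDeepLink(String path) {
  final segments = path.split('/').where((s) => s.isNotEmpty);
  final stack = <String>['/'];
  var current = '';
  for (final segment in segments) {
    current = '$current/$segment';
    stack.add(current);
  }
  return stack;
}
```

With this sketch, a deep link to `/profile` yields a two-entry history, so pressing “back” on the profile screen lands on the start screen instead of exiting the app.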

Navigation recovery is a similar issue. This issue is especially critical to a system like Flutter where navigation can take place within a single View. Imagine that you’re displaying a Flutter app within a single Activity. The user has navigated from “card feed” > “card” > “photo”. Then, the user leaves the app. The app is destroyed because Android needs to reclaim resources. The user returns to the app and it’s rebuilt. But now the Flutter View is brand new again — it doesn’t remember where the user was or what the user was doing. Somehow the Flutter app needs to retain knowledge of the previous navigation history, including any relevant user/business data that was collected along the way (imagine a sign-up flow, or two factor authentication flow).

Example of a sign-up flow with 2FA.
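To make the recovery requirement concrete, here is a minimal sketch of a navigation history that can be written out before the app is destroyed and read back afterwards. `VisitedRoute`, `saveHistory`, and `restoreHistory` are hypothetical names, and where the encoded string gets stored is deliberately left out.

```dart
import 'dart:convert';

/// One entry in the runtime navigation history (hypothetical type).
class VisitedRoute {
  VisitedRoute(this.name, [Map<String, Object?>? data])
      : data = data ?? <String, Object?>{};

  final String name;

  /// User/business data collected on this screen, e.g. sign-up fields.
  final Map<String, Object?> data;

  Map<String, Object?> toJson() => {'name': name, 'data': data};

  static VisitedRoute fromJson(Map<String, Object?> json) => VisitedRoute(
        json['name'] as String,
        Map<String, Object?>.from(json['data'] as Map),
      );
}

/// Encodes the history as JSON so it can be persisted somewhere that
/// survives process death (the storage mechanism is out of scope here).
String saveHistory(List<VisitedRoute> history) =>
    jsonEncode(history.map((r) => r.toJson()).toList());

/// Rebuilds the history from a previously saved JSON string.
List<VisitedRoute> restoreHistory(String saved) => (jsonDecode(saved) as List)
    .map((e) => VisitedRoute.fromJson(Map<String, Object?>.from(e as Map)))
    .toList();
```

The point of the sketch is that the history and the collected data travel together: restoring “card feed” > “card” > “photo” also restores whatever each of those screens had gathered.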

But it gets worse!

Not only do the above requirements for deep linking and navigation recovery exist, but so does access control. A user’s authentication/authorization can disappear at any moment. Even if User A had access to Screen X a minute ago, it doesn’t mean User A has access to that screen now. Furthermore, some apps have multiple levels of access control which are typically managed by Access Control Lists (ACLs). Thus, the navigation graph isn’t simply a static graph; it’s now a dynamic graph that adds/removes nodes based on the user’s current access level. And that access level can change at any moment in time.

Where do we bring together the responsibilities of virtual navigation history, collected information, and current access control? More specifically, what does one do when they are restoring a virtual navigation history but suddenly the app is missing some necessary collected information, or now the user’s authentication token has expired and the user no longer has access to all the screens in the navigation history?
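As a thought experiment, one possible policy (certainly not the only one) is to restore the longest prefix of the saved history that is still valid, and stop at the first route the user can no longer see or whose required data is missing. The types, the integer access levels, and the `validPrefix` function below are all assumptions made for this sketch.

```dart
/// A route in a restored history, with the access level it requires and the
/// keys of collected data it depends on (all names are hypothetical).
class RestoredRoute {
  const RestoredRoute(
    this.name, {
    this.requiredLevel = 0,
    this.requiredKeys = const <String>{},
  });

  final String name;
  final int requiredLevel;
  final Set<String> requiredKeys;
}

/// Keeps the longest valid prefix of [history]: restoration stops at the
/// first route the user may no longer access or whose data is missing.
List<RestoredRoute> validPrefix(
  List<RestoredRoute> history, {
  required int currentLevel,
  required Set<String> availableKeys,
}) {
  final valid = <RestoredRoute>[];
  for (final route in history) {
    final allowed = currentLevel >= route.requiredLevel;
    final hasData = availableKeys.containsAll(route.requiredKeys);
    if (!allowed || !hasData) break;
    valid.add(route);
  }
  return valid;
}
```

A real app might instead redirect to a login screen or re-prompt for the missing data. The sketch only shows that history, collected information, and access level have to be checked in one place.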

There is some amount of responsibility in this area that must be handled on a per-application basis. But that responsibility is probably a lot less than what we currently expect each app developer to figure out. Moreover, I don’t think mainstream navigation systems come close to recognizing the inherent complexity of real world navigation. Most frameworks still seem to think in terms of discrete screens, or rectangles within a screen that switch themselves like some kind of inner discrete screen (e.g., tabbed navigation).

Embedded Navigation

Navigation is not simply one continuous path. The common use of tabbed navigation in iOS and Android demonstrates a use of “embedded navigation.”

First the user selects a tab (or a tab is auto-selected upon launch). This is one level of navigation all its own. Once the user is viewing a tab, any navigation within that tab is saved as a back stack for that specific tab. You can try this navigation in the YouTube app for iOS right now. When you switch tabs you don’t lose your per-tab navigation history. Thus, you have multiple navigation histories persisting at the same time. Every one of those back stacks must answer to the graph restrictions mentioned earlier. And every one of those back stacks needs to be persisted across app destruction and recreation.
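The per-tab behavior described above can be sketched as a map from tab to back stack. `TabbedHistory` is a hypothetical class written for illustration; it assumes each tab's root screen is named after the tab itself.

```dart
/// Keeps an independent back stack per tab (hypothetical sketch).
class TabbedHistory {
  TabbedHistory(Iterable<String> tabs, {required String initialTab})
      : _stacks = {for (final tab in tabs) tab: <String>[tab]},
        _currentTab = initialTab {
    if (!_stacks.containsKey(initialTab)) {
      throw ArgumentError.value(initialTab, 'initialTab', 'unknown tab');
    }
  }

  final Map<String, List<String>> _stacks;
  String _currentTab;

  String get currentTab => _currentTab;

  /// The screen currently visible: the top of the active tab's stack.
  String get currentScreen => _stacks[_currentTab]!.last;

  /// Switching tabs leaves every tab's history untouched.
  void selectTab(String tab) {
    if (!_stacks.containsKey(tab)) {
      throw ArgumentError.value(tab, 'tab', 'unknown tab');
    }
    _currentTab = tab;
  }

  /// Pushes a screen onto the active tab's stack only.
  void push(String screen) => _stacks[_currentTab]!.add(screen);

  /// Pops within the active tab; returns false at the tab's root screen.
  bool pop() {
    final stack = _stacks[_currentTab]!;
    if (stack.length <= 1) return false;
    stack.removeLast();
    return true;
  }
}
```

Tab selection and in-tab navigation are two separate levels here: `selectTab` never touches a stack, and `push`/`pop` never change the tab.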

Visualizing Navigation

I think our industry needs a strong mental model for what navigation looks like, conceptually.

Compile-time navigation (the conceptual navigation options that a developer sets up in code) has long been discussed as a graph and I think that metaphor still works.

We also need to discuss runtime navigation (the actual navigation history of a user at runtime). Runtime navigation is often a stack or a tree, but it is also conceivable that a user navigates forward to a screen that is already in the back stack and thereby creates a loop. Thus, we’re dealing with a graph for runtime navigation, too.

Here are conceptions of those graphs:

Navigable Graph

The graph of all app transitions permitted for a user at a certain access level (or you can combine all access levels into one graph and denote the permissions per node).

Example of a Navigable Graph

The Navigable Graph also contains requirements at certain nodes that certain information is available. For example, during a sign up flow, it is expected that by Screen 3, the user has provided all necessary information for Screen 1 and Screen 2. It may be illegal to view Screen 3 without having first collected Screen 1 and Screen 2’s information.
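A rough way to express such a graph in code is to give each node a minimum access level and a set of required data keys, then check both before allowing a transition. Everything below (`NavigableNode`, `NavigableGraph`, `canNavigate`, integer access levels) is a hypothetical sketch, not an existing API.

```dart
/// A node in the Navigable Graph (hypothetical sketch).
class NavigableNode {
  NavigableNode(
    this.name, {
    this.minAccessLevel = 0,
    this.requiredData = const <String>{},
  });

  final String name;
  final int minAccessLevel;

  /// Keys of information that must already be collected to view this node.
  final Set<String> requiredData;

  /// Names of nodes reachable from this one (outgoing edges).
  final Set<String> next = <String>{};
}

class NavigableGraph {
  final Map<String, NavigableNode> _nodes = <String, NavigableNode>{};

  void addNode(NavigableNode node) {
    _nodes[node.name] = node;
  }

  void addEdge(String from, String to) {
    final source = _nodes[from];
    if (source == null || !_nodes.containsKey(to)) {
      throw ArgumentError('both "$from" and "$to" must be added first');
    }
    source.next.add(to);
  }

  /// True only if the edge exists, the user's access level is sufficient,
  /// and all data the target requires has been collected.
  bool canNavigate(
    String from,
    String to, {
    required int accessLevel,
    required Set<String> collected,
  }) {
    final source = _nodes[from];
    final target = _nodes[to];
    if (source == null || target == null) return false;
    return source.next.contains(to) &&
        accessLevel >= target.minAccessLevel &&
        collected.containsAll(target.requiredData);
  }
}
```

In the sign-up example, Screen 3 would list the data gathered on Screens 1 and 2 in `requiredData`, so a jump to Screen 3 without that data is rejected.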

Navigation Graph

The runtime graph of all screens the user has visited.

Example of a Navigation Graph sitting atop a Navigable Graph

This graph is most likely a tree, but can in some cases be a legitimate graph.

Navigation Graph nodes may contain information collected by their associated UI. Associating collected information with a UI state allows the navigation system to automatically re-hydrate that UI if the user navigates back to that UI. For example, consider a form that takes in a user’s address, then the user navigates forward, and then navigates back — the address form needs to rehydrate with the information the user previously entered.
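Here is a minimal sketch of that idea, simplified to a linear history. `HistoryNode` and `NavigationHistory` are hypothetical names; the only point is that collected data lives on the node, so it is still there when the user comes back.

```dart
/// One node in the runtime Navigation Graph (hypothetical type).
class HistoryNode {
  HistoryNode(this.route);

  final String route;

  /// Values the user entered while this node's UI was on screen.
  final Map<String, String> collected = <String, String>{};
}

/// A linear navigation history whose nodes keep their collected data.
class NavigationHistory {
  NavigationHistory(String initialRoute) {
    _nodes.add(HistoryNode(initialRoute));
  }

  final List<HistoryNode> _nodes = <HistoryNode>[];

  HistoryNode get current => _nodes.last;

  /// Navigates forward. The node being left stays in the history, together
  /// with everything it has collected.
  void forward(String route) => _nodes.add(HistoryNode(route));

  /// Navigates back and returns the node that becomes current, so its UI can
  /// be rebuilt from [HistoryNode.collected]. Returns null at the root.
  HistoryNode? back() {
    if (_nodes.length <= 1) return null;
    _nodes.removeLast();
    return _nodes.last;
  }
}
```

In the address example, the form would write into `current.collected` as the user types, and read from the node returned by `back()` when it is shown again.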

Renderable Graph vs Navigable Graph

Here’s an idea that I haven’t heard mentioned before. Often we think about rendering as an ephemeral and constantly changing process. But we also know that every moment of time corresponds to some kind of application state.

In other words, even as animations rapidly generate 60 frames per second, every single one of those frames represents a unique and discrete rendering state. Moreover, since the direction of time is constant, those render states have a specific order. We can logically connect every render frame to the next, and now we have a stack, AKA a degenerate tree, which then qualifies as a degenerate graph. Hence, there is a “render graph,” as well.

If we consider the set of all possible renderings of a UI (all the different animations and layouts), we end up with a legitimate graph. Let’s call it the Renderable Graph.

Example of a Renderable Graph

The Renderable Graph is a superset of the Navigable Graph. In other words, the Renderable Graph contains the Navigable Graph, as well as millions of additional nodes.

The Navigation Graph is a subset of the Navigable Graph, which is a subset of the Renderable Graph.

This notion of a relationship between a Renderable Graph and a Navigable Graph then leads to a statement about what “navigation” really is.

First, let’s say that a “route” is an “addressable” node in the Renderable Graph.

At runtime a “visited route” is a “route” that the user has visited, either explicitly or virtually, and may contain metadata related to the user journey.

Then, “navigation” is the runtime access to routes, enforcement of route constraints, and retention of visited route history.

The definitions of “route” and “navigation” are both extremely generic, but that’s because navigation turns out to be a very generic concept. We simply can’t think in terms of screens, or boxes, or tabs in today’s world of dynamic user interfaces. Instead we need to recognize the complexity of the Renderable Graph and then provide machinery to build a Navigable Graph atop the Renderable Graph, and then also provide machinery to enforce the Navigable Graph at runtime by building and maintaining the Navigation Graph.
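To show how small the core of that definition is, the three responsibilities (access to routes, enforcement of constraints, retention of history) can be sketched in a few lines. `SketchNavigator`, `RouteConstraint`, and the `session` map are hypothetical and deliberately ignore rendering entirely.

```dart
/// Decides whether a route may be visited given the current session data
/// (hypothetical signature for this sketch).
typedef RouteConstraint = bool Function(Map<String, Object?> session);

class SketchNavigator {
  final Map<String, RouteConstraint> _routes = <String, RouteConstraint>{};
  final List<String> _visited = <String>[];

  /// Makes a node "addressable" by registering it under [address].
  void register(String address, {RouteConstraint? constraint}) {
    _routes[address] = constraint ?? ((_) => true);
  }

  /// Retained history of visited routes, oldest first.
  List<String> get visited => List<String>.unmodifiable(_visited);

  /// Runtime access to a route: succeeds only if the address is known and
  /// its constraint holds, and records the visit when it does.
  bool visit(String address, Map<String, Object?> session) {
    final constraint = _routes[address];
    if (constraint == null || !constraint(session)) return false;
    _visited.add(address);
    return true;
  }
}
```

Nothing in the sketch mentions screens, tabs, or boxes; an address could just as well name a particular state of a list animation.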

Application Graph Theory

Why all the fuss about these graphs?

I think the conceptual relationship between the Renderable Graph and the Navigable Graph is especially significant in the world of Flutter, because Flutter builds UIs declaratively.

In other words, the way that we build UIs in Flutter is by essentially declaring the Renderable Graph.

Think about how we apply animations in Flutter:

    AnimationController animation = ...;

    ...

    Widget build(BuildContext context) {
      return Opacity(
        opacity: animation.value, // opacity at a moment in time
        child: Container(
          width: 50.0,
          height: 50.0,
          color: Colors.red,
        ),
      );
    }

In the above example we literally declare how to render the UI at a moment in time. We declare how Flutter should configure a single node in that massive Renderable Graph. Therefore, I think Flutter provides a UI model that allows us to speak explicitly about the Renderable Graph.

So maybe we can leverage that Renderable Graph to define a Navigable Graph?

Ok, but why do we care about the Navigable Graph?

If we can explicitly configure navigation with a Navigable Graph then it means we can treat all navigation concerns in a uniform manner. Thus, a framework/library can handle back stacks (and back trees) of any complexity, deep linking, navigation history restoration, and immensely complex visual transitions. All a developer has to do is plug in his/her application traversal decisions in the right places to configure this graph.

A uniform navigation treatment might sound like a pipe dream, but should it? Flutter has successfully demonstrated a declarative solution for rendering pretty much any UI. In fact, I recently wrote a post about Flutter’s declarative widget system along those lines. So why can’t we do the same thing with navigation? Does navigation really need to be the mess of tangled wires that it has become in traditional Android and iOS? Do we really need to stick 90% of navigation responsibility on individual app developers?

My gut tells me “no.” I have a feeling that there is a uniform solution out there, at least for declarative UI systems like Flutter (and perhaps React).

I think we should try to find it.
