Shared Data Loading Across Components

Avoiding duplicate network requests in web applications

Published in Uncountable Engineering

Web applications need to load data from the server to render content to the user. In a complex application, there are often multiple components in view at the same time that all depend on the same data. How can we write a modular app that promotes building independent, reusable components while avoiding loading the same data multiple times from the server?

This is a problem we needed to solve at Uncountable. We use React and Redux, with a variety of components spread over several pages. As we modularized and encapsulated behaviour, we walled off knowledge about what data a component needed to load. This was good, since it reduced the burden of using shared components. However, we had a problem: if two components both need the same data, how can we prevent it from being loaded twice?

To manage this, we created a data loader, which is a central system that can manage loading across apps.

We use Redux in our app, but this was more of a hindrance than a help. While it does provide a way to share data, it’s inimical to the idea of reusable and encapsulated components. The concept below is as easy, if not easier, to implement with pure React. For more on state management using React, see my article on A mini-redux in React.

Example

Let’s start with an example to demonstrate the problem and identify the parts needed for the solution. Say we have an endpoint /api/get_user_data which retrieves some basic information about the user, including their name, team, and basic permissions. Now say we’re on a team dashboard page that shows several cards of information.

There are three different components that want the user information:

Navbar: This appears on every site page and contains a link to the team page, for which it needs to know the team, or at least its name.
Grid Layout: Some cards have privileged information and should not be shown for all users, therefore the grid layout needs to know the user’s permissions to decide which cards to display.
Profile Card: One of the cards shows a quick profile overview, including the user’s name, team, and whatever details are available from the get_user_data endpoint.

They each want access to the same user data. There are three approaches one could use to ensure each component gets access to this data:

Page Coordinated: The top-level page code (in this case, the team dashboard page) loads all the data needed by the components on the page and provides it to them. This is hard to manage and breaks the encapsulation of components: you have to hard-code all the needed data at the top level, outside of where it’s actually needed.
Uncoordinated: Every component simply loads its own data. It’s easy, but we end up with duplicate loads.
Data Manager: The approach I’m describing in this article. The team page doesn’t know what data is being loaded; each component requests its own data; the central data loader coordinates the load.

Keys and Loaders

In order to coordinate the loading of shared data we need to know which data should be shared, as well as have a way to load that data. Here’s some pseudo-code showing how a component would provide that information to the data manager.

const userData = dataManager.useLoadData( `user-data`, requestGetUserData )

Where requestGetUserData is a function, called a loader, that the data manager can use to load the data on demand. The data manager needs the loader since it has no prior knowledge of any data; it’s a relatively pure component that knows nothing about business logic. This was an architectural choice to ensure we could support segmented loading of apps, avoid monolithic feature maps, minimize code executed at load-time, and keep business logic out of the generic utilities. It’s easier to maintain this way.

The goal is that any component can make the same request to the data manager. The data manager will collect these requests, make a single call to the loader, and return the same data to all of them. I’ll get back to how it coordinates a bit later. In our React architecture, we’re using a hook function, thus the use… prefix. Until the data is loaded, this function returns undefined.

Our data loader hooks use React’s useState function. When the data manager has new data, it calls the listeners (described later), which call the state setter, which causes React to rerender the component.
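The listener mechanism can be sketched without React. This is a minimal illustration rather than our production code: ListenerRegistry, subscribe, and publish are hypothetical names. In a hook, the listener would be the state setter from useState, registered in an effect and removed again on unmount.

```typescript
// Hypothetical sketch: a per-key listener registry, independent of React.
type Listener = (data: unknown) => void

class ListenerRegistry {
	private listeners = new Map<string, Set<Listener>>()

	// Register a listener for a key; returns a function that removes it again.
	subscribe(key: string, listener: Listener): () => void {
		const set = this.listeners.get(key) ?? new Set<Listener>()
		this.listeners.set(key, set)
		set.add(listener)
		return () => {
			set.delete(listener)
		}
	}

	// Called when new data arrives for a key: every listener gets the same value.
	publish(key: string, data: unknown): void {
		const set = this.listeners.get(key)
		if (set === undefined) return
		// Copy first so that a listener may unsubscribe while being notified.
		Array.from(set).forEach((listener) => listener(data))
	}
}

// Two components listening on the same key receive the same object.
const registry = new ListenerRegistry()
const received: unknown[] = []
registry.subscribe(`user-data`, (data) => received.push(data))
const unsubscribe = registry.subscribe(`user-data`, (data) => received.push(data))
registry.publish(`user-data`, { name: `Ada` })
unsubscribe()
```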

That’s basically it. We now have a system where each component can independently request the data; the data manager collects those requests and makes a single call. That said, there are a few issues here, and a few details to cover.

Mismatched keys and loaders

What’s the problem if each component has to use the exact same code below?

dataManager.useLoadData( `user-data`, requestGetUserData )

It’s error prone.

If I make a typo on the key, I’ll still load the data but end up with an extra server request.

What if I accidentally use the wrong loader? This is the trouble case. The data manager expects every use of a key to map to the same loader. If different callers provide different loaders for one key, it is undefined which loader is used. This can cause the wrong data to be returned to all components.

The most straightforward solution here is to use the loader itself as the key. The JavaScript Map type supports this. Our application uses dynamic loading though, so could we rely on functions being singletons? Interesting question, but actually not relevant in our case. There’s a weird history to our code that forced us to use an object, where functions aren’t valid keys. By the time a Map was possible, the data manager pattern was already widespread.

Nonetheless, there is a simple fix that works even if you can’t use the function as the key. Instead of using the key and loader directly, you can wrap them to create a loader object. This exact form didn’t actually exist in our code, as we jumped from ad-hoc solutions to a handful of system-specific solutions.

export const getUserDataLoader = {
	key: `user-data`,
	loader: requestGetUserData,
}

...

dataManager.useDataLoader( getUserDataLoader )

This doesn’t completely solve the problem. We could have two distinct loader objects using the same key by accident. We need to prevent that from happening. For data loaded directly from API endpoints, the simple answer is to use the endpoint string as the key. Instead of user-data, use /api/get_user_data. These are unique paths that won’t overlap with any other loaders.
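One way to catch two loader objects that share a key is to create every loader object through a single helper. The sketch below is hypothetical (defineLoader and the key registry are not part of our code base) and uses the endpoint path as the key.

```typescript
// Hypothetical sketch: create loader objects through one helper, so a key
// that is reused by a second loader object is detected immediately.
type DataLoader<LoaderT> = { key: string; loader: LoaderT }

const registeredKeys = new Set<string>()

function defineLoader<LoaderT>(key: string, loader: LoaderT): DataLoader<LoaderT> {
	if (registeredKeys.has(key)) {
		throw new Error(`Duplicate data loader key: ${key}`)
	}
	registeredKeys.add(key)
	return { key, loader }
}

// Stand-in for the real request function, which would call the endpoint.
const requestGetUserData = () => {
	/* ... */
}

// The endpoint path doubles as the key.
const getUserDataLoader = defineLoader(`/api/get_user_data`, requestGetUserData)
```

A second defineLoader call with /api/get_user_data would throw as soon as its module is evaluated, instead of silently sharing a cache entry.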

Arguments

I’ve simplified the scenario by using an endpoint that doesn’t take any arguments. /api/get_user_data doesn’t have any options and always returns the same data for the given session.

Let’s instead consider the endpoint /api/get_team_data. Unlike the user data, we need to provide the id of the team. We can no longer define a single key for this data. Two different components loading different team data would clash on a singular team-data key.

So we added an arguments parameter to the data manager calls, and the loaders. The component requests data like below.

const teamData = dataManager.useDataLoader( getTeamDataLoader, { id: teamId } )

The second argument to the useDataLoader function is passed to the underlying loader, as well as used to construct a unique key.

To construct the final key, we JSON-serialize the arguments and append the result to the base key. If teamId == 123 we’d end up with team-data:{"id":123}. Any change to the arguments results in a new key.

It’s important here to use a stable JSON serializer: one that sorts the keys. Otherwise, when there are multiple properties in the arguments structure, you could end up with different serializations of logically equivalent arguments. We actually went further: for each API, we specify which array arguments should also be sorted, since they represent sets where the order doesn’t matter.
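A stable serializer takes only a few lines. The following is a sketch under assumptions, not our actual implementation: stableStringify and makeKey are hypothetical names, arrays keep their order, and a loader without arguments is assumed to use the base key unchanged.

```typescript
// Hypothetical sketch of a stable serializer: object properties are emitted
// in sorted order, so logically equal arguments give identical strings.
type Json = null | boolean | number | string | Json[] | { [name: string]: Json }

function stableStringify(value: Json): string {
	if (value === null || typeof value !== "object") {
		return JSON.stringify(value)
	}
	if (Array.isArray(value)) {
		// Array order is significant by default, so it is preserved here.
		return "[" + value.map((item) => stableStringify(item)).join(",") + "]"
	}
	const parts = Object.keys(value)
		.sort()
		.map((name) => JSON.stringify(name) + ":" + stableStringify(value[name]))
	return "{" + parts.join(",") + "}"
}

// Assumption: without arguments, the base key is used as is.
function makeKey(baseKey: string, args?: Json): string {
	return args === undefined ? baseKey : `${baseKey}:${stableStringify(args)}`
}

// Property order in the arguments does not affect the key.
const keyA = makeKey(`team-data`, { id: 123, include: [`members`] })
const keyB = makeKey(`team-data`, { include: [`members`], id: 123 })
```

Here keyA and keyB are the same string, so both requests would share one entry in the data manager.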

How does the data manager coordinate loading?

Now that we’ve seen how the components interact with the data manager, we’ll take a short look at what the data manager itself has to do.

The Loader

First, let’s see what the loader function signature looks like. Or rather, what a simplified and cleaned version could look like — ours is a bit marred with history.

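// Called with the data once it is available.
type OnData<ResponseT> = (data: ResponseT) => void
// Called with the failed response if the data cannot be loaded.
type OnError = (response: any) => void
// The parameter is named `args` because strict-mode code (such as a module)
// does not allow `arguments` as a parameter name.
type LoaderFunction<ArgumentsT, ResponseT> = (
	args: ArgumentsT,
	onData: OnData<ResponseT>,
	onError: OnError,
) => void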

We’re using TypeScript generics to ensure that the components will get a type-correct view of the data they are loading. The useDataLoader function is also generic and can infer the types from the loader. Internal to the data manager, we treat the typed values as opaque objects. We still have a requirement that they must all be JSON compatible so they can be stored in Redux.
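To illustrate the type inference, here is a framework-free sketch. Everything in it is hypothetical: loadOnce stands in for the real hook, TeamData is an invented response type, and the example loader calls back synchronously only to keep the code short.

```typescript
// Hypothetical sketch: a generic entry point that infers the argument and
// response types from a typed loader object.
type LoaderFunction<ArgumentsT, ResponseT> = (
	args: ArgumentsT,
	onData: (data: ResponseT) => void,
	onError: (response: unknown) => void,
) => void

type DataLoader<ArgumentsT, ResponseT> = {
	key: string
	loader: LoaderFunction<ArgumentsT, ResponseT>
}

// Illustrative response type and loader; a real loader would call the endpoint.
type TeamData = { id: number; name: string }
const getTeamDataLoader: DataLoader<{ id: number }, TeamData> = {
	key: `team-data`,
	loader: (args, onData) => onData({ id: args.id, name: `Team ${args.id}` }),
}

// ArgumentsT and ResponseT are inferred from the loader object passed in.
function loadOnce<ArgumentsT, ResponseT>(
	dataLoader: DataLoader<ArgumentsT, ResponseT>,
	args: ArgumentsT,
	onData: (data: ResponseT) => void,
): void {
	dataLoader.loader(args, onData, (response) => console.error(response))
}

let teamName = ``
// `data` is typed as TeamData here; passing { id: `123` } would not compile.
loadOnce(getTeamDataLoader, { id: 123 }, (data) => {
	teamName = data.name
})
```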

The first parameter receives whatever was passed as arguments to the useDataLoader function.

As the loader is most likely making an endpoint call, the loading is asynchronous. The data manager can’t expect the data to be immediately available and provided in a return statement. Instead, an onData function is provided, which should be called with the data when it is available. An onError function is provided in case the data cannot be loaded.

We had, at one point, a loader that wasn’t asynchronous: it reported a static enum of values. This tripped up our data loader since it wasn’t expecting the onData function to be called prior to the function returning. An easy fix was to defer the onData via a setTimeout call — that’s easier than trying to handle this exceptional case in the data manager.
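The same deferral can be written once as a wrapper instead of inside each synchronous loader. This is a hypothetical generalization of the fix described above: deferCallbacks and requestStaticColors are invented names, and the loader signature is the simplified one from this article.

```typescript
// Hypothetical sketch: wrap a loader so that onData/onError are never
// invoked before the call to the loader itself has returned.
type LoaderFunction<ArgumentsT, ResponseT> = (
	args: ArgumentsT,
	onData: (data: ResponseT) => void,
	onError: (response: unknown) => void,
) => void

function deferCallbacks<ArgumentsT, ResponseT>(
	loader: LoaderFunction<ArgumentsT, ResponseT>,
): LoaderFunction<ArgumentsT, ResponseT> {
	return (args, onData, onError) =>
		loader(
			args,
			(data) => setTimeout(() => onData(data), 0),
			(response) => setTimeout(() => onError(response), 0),
		)
}

// A synchronous loader that reports a static list of values.
const requestStaticColors: LoaderFunction<void, string[]> = (_args, onData) =>
	onData([`red`, `green`, `blue`])

const events: string[] = []
deferCallbacks(requestStaticColors)(
	undefined,
	() => events.push(`data`),
	() => events.push(`error`),
)
events.push(`returned`)
// At this point events holds only `returned`; `data` is added on a later tick.
```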

The Status

When a component calls useDataLoader, the data it wants, identified by the key, can be in one of these states:

Unloaded: This is the first request of data with this key.
Loading: The data is currently loading.
Loaded: The data is available now.
Error: The data failed to load.

The interesting case is the Loading case: why do we need it?

When a call comes in and the data is Loaded we can simply return the data. If the data is Unloaded, we can schedule a call to the loader and return undefined. If another call comes in now, and we left the state as Unloaded, then we’d schedule a second call to the loader. To avoid this, we switch the state to Loading the moment we schedule the first call. Any further calls while Loading will have undefined returned immediately, without scheduling another loader call.

In the error case, we return undefined. We do not attempt to reload the data, nor do we schedule another try later. Instead, the user will see the error, our log server will log the error, and we’ll fix whatever caused it. Keep in mind that this entire data state lives for only a single page in the app. An ephemeral error will be cleared as soon as the user navigates elsewhere.
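The status handling can be sketched as a small state machine. This is an illustration with hypothetical names (StatusTracker, request), not our implementation: it calls the loader immediately rather than scheduling it, and it leaves out notifying listeners.

```typescript
// Hypothetical sketch of the per-key status tracking.
type Loader = (onData: (data: unknown) => void, onError: (response: unknown) => void) => void

type Entry =
	| { status: "loading" }
	| { status: "loaded"; data: unknown }
	| { status: "error"; response: unknown }

class StatusTracker {
	// A key that is missing from the map is in the Unloaded state.
	private entries = new Map<string, Entry>()

	request(key: string, loader: Loader): unknown {
		const entry = this.entries.get(key)
		if (entry === undefined) {
			// Unloaded: switch to Loading before the loader runs, so further
			// requests for the same key don't trigger a second call.
			this.entries.set(key, { status: "loading" })
			loader(
				(data) => this.entries.set(key, { status: "loaded", data }),
				(response) => this.entries.set(key, { status: "error", response }),
			)
			return undefined
		}
		// Loading and Error both yield undefined; errors are not retried.
		return entry.status === "loaded" ? entry.data : undefined
	}
}

// A fake loader that counts its calls and lets us deliver the data by hand.
let loaderCalls = 0
let deliver: (data: unknown) => void = () => {}
const fakeLoader: Loader = (onData) => {
	loaderCalls += 1
	deliver = onData
}

const tracker = new StatusTracker()
const first = tracker.request(`user-data`, fakeLoader) // Unloaded -> Loading
const second = tracker.request(`user-data`, fakeLoader) // still Loading
deliver({ name: `Ada` }) // Loading -> Loaded
const third = tracker.request(`user-data`, fakeLoader) // returns the data
```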

Reload

Our system evolved from rather simple beginnings, and has a bit of debt we’ll eventually clean up. One sore point was an explicit reload feature: if a component modifies the data, it should be able to tell the data manager to reload it. In theory, this could be a simple dataManager.reset( someLoader ) call, but due to some initial design choices, this was hard to implement. Just be aware you need this from the start and you won’t have a problem.
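For reference, a reset that is designed in from the start can be very small. The sketch below is hypothetical and deliberately simplified: the loader is synchronous, and ResettableCache is an invented name. A real version would also notify listeners so that mounted components request the data again.

```typescript
// Hypothetical sketch of a cache with an explicit reset; not our actual API.
type DataLoader = { key: string; loader: () => unknown }

class ResettableCache {
	private data = new Map<string, unknown>()

	get(dataLoader: DataLoader): unknown {
		if (!this.data.has(dataLoader.key)) {
			this.data.set(dataLoader.key, dataLoader.loader())
		}
		return this.data.get(dataLoader.key)
	}

	// Forget the cached value so that the next get() loads it again.
	reset(dataLoader: DataLoader): void {
		this.data.delete(dataLoader.key)
	}
}

// A loader that counts how often it runs.
let loads = 0
const counterLoader: DataLoader = { key: `counter`, loader: () => ++loads }

const cache = new ResettableCache()
cache.get(counterLoader) // loads once
cache.get(counterLoader) // served from the cache
cache.reset(counterLoader)
const reloaded = cache.get(counterLoader) // loads again
```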

To work around our initial design choices, we ended up adding an additional load key argument to the load functions. The key doesn’t change what is loaded, but changing the key forces a reload. We used the name “version” for this load key, which is quite misleading. Each call to the data manager has its own tracking, so we avoid any need to coordinate a proper “version”.

Reloading shared data actually doesn’t come up that much. The data manager only shares the limited amount of data that is commonly used across components. Aside from this shared data, each of our pages has its own set of primary data that it modifies. This is often a larger set of data and uses multiple API calls to load and save on demand. By limiting the data manager to a small set of mostly immutable data, we avoid needing to reload it. It’s the rare case where a page modifies data that might be shared and needs to inform the data manager of this.

Generalized Use

But doesn’t Product XYZ do this? Our data manager does feel like something that should have a packaged product, but is there one that fits into Redux and our app structure? I didn’t think it would involve much code, nor did I want to fall victim to invented here syndrome, so I wrote the data manager after only a brief search. It would be nice to generalize, and clean up, this library to publish. If not for re-use, then just for a full demonstration of the concept.

Having our own code let us go further than the abstractions I’ve shown here. We built API specifications into type-spec, which is our cross-language portable type system. Because we generate all the wrappers and setup automatically, most coders only see something like ApiGetUserData.dataLoader.useLoadAndSelect. Apart from the term dataLoader there, they won’t have any exposure to the data manager.

In this article, I’ve focused mostly on a 1:1 mapping from key to API endpoint, but the data manager is quite generic. We also use it for a generic naming service. In that system we merge the arguments from multiple call sites into a single API call. But I’ll need another article to talk about this naming service.