Building Tools To Impart Order in a Disorderly World

Gurdas Nijor
Published in OpenGov Developers
6 min read · Jul 14, 2016

As a species, we humans spend a lot of time classifying and deconstructing the world around us: breaking it down into its constituent parts, finding relations between those parts, and connecting them to each other.

Photos by (clockwise from top left) Yanko Peyankov, Bodie Pyndus, Michael Bonfiglio, Roger Thomas, Maarten van den Heuvel

The world of financial accounting is no different. Our drive to impart meaning to our natural world applies equally to the financial structures that we’ve created to describe the internal workings of our governments, institutions, businesses and the functions therein.

In accounting, this fundamental structure is what’s known as a Chart of Accounts. For the governments we serve, it’s effectively a map of their organizational structure, with different hierarchies describing particular facets of the organization (whether a breakdown of departments, the funds revenue is sourced from, or expenses).

The operational complexity of the modern government makes managing its Chart of Accounts a difficult endeavor. With the constant reorganization, consolidation, and expansion of different functions over time, it becomes hard to get a clear picture of the financial health of the different parts of an organization. From a technical and user-experience perspective, building tools that support the scale of this data and its constant churn is a challenge.

Engineering a Solution

The products we build at OpenGov put a primary focus on the user and her ability to manipulate, interact with, and visualize data in a highly performant way with a minimal amount of friction. Achieving this goal in the face of large amounts of data requires some novel approaches, along with transformative technologies: a “modern” front-end stack in which React and Redux play large roles.

React has enabled us to scale while finely tuning the performance of data-intensive user interfaces. To give some idea of why this matters: an operationally complex city can have a Chart of Accounts (which we model as a tree data structure) that runs to tens of thousands of nodes and can join to potentially millions of records of data.

Another technology that has played a key part in the development of a solution for building a robust editor for a Chart of Accounts is Immutable.js, an immutable collections library that describes its benefits right on the box:

Immutable data cannot be changed once created, leading to much simpler application development, no defensive copying, and enabling advanced memoization and change detection techniques with simple logic. Persistent data presents a mutative API which does not update the data in-place, but instead always yields new updated data.

A number of features emerge from the properties of its persistent data structures:

  • Performance
    Cheap equality checks mean fewer wasted cycles when data hasn’t actually changed between renders
  • Predictability
    Lack of mutation means data won’t change in unexpected ways
  • Multi-level Undo/Redo
    Entire versions of a large data structure can be kept in memory (with minimal overhead) because of structural sharing
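To see why structural sharing makes this cheap, here is a minimal sketch using plain JavaScript objects in place of Immutable.js (the `setIn` helper and sample data are illustrative): an update returns a new root, while untouched branches keep their original references, so a reference check is enough to tell whether a subtree changed.

```javascript
// Persistent-style update: copy only the path that changed,
// reuse every other branch by reference (structural sharing).
function setIn(obj, [key, ...rest], value) {
  return { ...obj, [key]: rest.length ? setIn(obj[key], rest, value) : value };
}

const v1 = { departments: { fire: { budget: 100 }, parks: { budget: 50 } } };
const v2 = setIn(v1, ['departments', 'fire', 'budget'], 120);

console.log(v2.departments.fire.budget);  // 120
console.log(v1.departments.fire.budget);  // 100 — the old version survives
console.log(v2.departments.parks === v1.departments.parks); // true — shared
```

Because `v1.departments.parks` and `v2.departments.parks` are the same object, a renderer can skip that entire subtree with a single `===` check.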

I find that the last property tends to be the least exploited aspect of persistent data structures, despite delivering substantial product value.

The Art of Time Travel

User expectations for web applications have grown tremendously. Many features that exist in desktop applications are expected to be present in their web-based counterparts. One critical feature in applications that manipulate complex documents through sophisticated operations (digital design, CAD, office software) is multi-level undo/redo.

Something that was once difficult to implement correctly becomes nearly trivial when using persistent data structures: simply append an element representing a snapshot of your application’s state to the end of an array every time it changes. Structural sharing of large chunks of data across elements ensures a drastically lower memory footprint than cloning it after every edit.
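As a sketch of that array-of-snapshots idea (the class and method names here are illustrative, not OpenGov’s actual code), an undo/redo history can be as small as:

```javascript
// A hypothetical undo/redo history over immutable state snapshots.
class History {
  constructor(initialState) {
    this.states = [initialState]; // every snapshot ever committed
    this.index = 0;               // which snapshot is "current"
  }
  push(nextState) {
    // A new edit discards any redo branch, then appends a snapshot.
    // With structural sharing, each entry costs only the changed nodes.
    this.states = this.states.slice(0, this.index + 1).concat([nextState]);
    this.index += 1;
  }
  undo() {
    if (this.index > 0) this.index -= 1;
    return this.current();
  }
  redo() {
    if (this.index < this.states.length - 1) this.index += 1;
    return this.current();
  }
  current() {
    return this.states[this.index];
  }
}

const h = new History({ nodes: [] });
h.push({ nodes: ['Public Safety'] });
h.push({ nodes: ['Public Safety', 'Parks'] });
console.log(h.undo().nodes); // ['Public Safety']
console.log(h.redo().nodes); // ['Public Safety', 'Parks']
```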

At any time, a user is free to jump to an arbitrary point in the edit history and resume making changes from there, or to save.

Which brings up the topic of persistence: traveling back and forth through local edits is fun, but at some point we’re going to want to share our current state with the rest of the world.

Synchronizing State

An approach that immediately comes to mind is to take a resource-oriented (or RESTful) stance and decompose the features of a Chart of Accounts tree into independent resources to be modified. An update to a node (“Public Safety”, for instance) under a tree (“Departments”) could happen with a request made to a URL like the following:

/api/v1/chart_of_accounts/{chart}/{departments}/{public_safety}

(where chart, departments, and public_safety are unique IDs that map to the corresponding resources)

Setting aside the subtleties of which HTTP verb to use and the specific shape of the request payload, the problem most clients run into is the difficulty of performantly composing more complex operations out of updates to multiple resources. Many expensive HTTP requests may have to be coordinated to express basic operations like moving nodes, adding or deleting groups of nodes, or creating additional levels of hierarchy in a tree, all without any transactional guarantees.

These complex update requirements lead many APIs down the path of abandoning the notion of independently addressable resources for the simpler (and safer) approach of ad-hoc objects that define a batch of operations to be applied within the scope of a single transaction.

{
  created_nodes: [{ id: 1, name: 'Fire trucks' }, ...],
  updated_nodes: [{ id: 2, name: 'School supplies' }, ...],
  ...
}

Taking this even further, we see that an optimal update API for complex documents deals not in firing off requests to independently addressable resources, but in expressing the delta between local and remote state and applying that “patch” to the remote state to bring it in sync with what is observed locally (similar in spirit to how Git works).

Luckily, a standard exists for expressing these deltas: JSON Patch (RFC 6902). Here’s how the update request above would look when formatted as a JSON Patch document:

[
  {
    "op": "add",
    "path": "/expenses",
    "value": { "id": 1, "name": "Fire trucks" }
  },
  {
    "op": "replace",
    "path": "/expenses/2/name",
    "value": "School supplies"
  }
]

A number of libraries exist for generating these deltas when given two immutable data structures as inputs, such as immutable-js-diff. It doesn’t take much of a leap to see how this could be used with the technique for multi-level undo/redo discussed in the previous section to produce “changesets” between any two state snapshots that can be used to bring a remote state store up to date with a local one.
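As a toy illustration of what such a diffing library computes, here is a naive delta generator for flat objects (the function and sample data are hypothetical); real libraries handle arbitrarily nested trees, but the ops they emit follow the same JSON Patch shape.

```javascript
// Compare two flat objects and emit JSON Patch-style ops describing
// what changed between the "before" and "after" snapshots.
function diff(before, after) {
  const ops = [];
  for (const key of Object.keys(after)) {
    if (!(key in before)) {
      ops.push({ op: 'add', path: `/${key}`, value: after[key] });
    } else if (before[key] !== after[key]) {
      ops.push({ op: 'replace', path: `/${key}`, value: after[key] });
    }
  }
  for (const key of Object.keys(before)) {
    if (!(key in after)) ops.push({ op: 'remove', path: `/${key}` });
  }
  return ops;
}

const prevBudgets = { fire: 100, parks: 50 };
const nextBudgets = { fire: 120, parks: 50, library: 30 };
console.log(diff(prevBudgets, nextBudgets));
// [ { op: 'replace', path: '/fire', value: 120 },
//   { op: 'add', path: '/library', value: 30 } ]
```

Feed any two snapshots from the undo/redo history into such a function and the result is exactly the changeset needed to move a remote store from one to the other.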

The strategy described above not only enables easy undo/redo of changes; it also provides a strategy for versioning (by actually saving the deltas and using them to “recover” the state at any given time, similar to a database transaction log) and even opens the door to realtime collaborative editing using a differential sync mechanism.
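That “recover” step, replaying saved deltas over a base state, can be sketched with a toy applier for the two JSON Patch ops from the earlier example (names here are illustrative); a production system would use a complete implementation such as the fast-json-patch library.

```javascript
// Apply "add" and "replace" JSON Patch ops to a plain-object document,
// returning a new document and leaving the base snapshot untouched.
function applyPatch(doc, patch) {
  // Deep-copy so the base snapshot is left intact.
  const next = JSON.parse(JSON.stringify(doc));
  for (const { op, path, value } of patch) {
    const keys = path.split('/').slice(1); // drop the leading ""
    const last = keys.pop();
    const parent = keys.reduce((obj, key) => obj[key], next);
    if (op === 'add' || op === 'replace') parent[last] = value;
  }
  return next;
}

const state = { expenses: { 2: { id: 2, name: 'Supplies' } } };
const patched = applyPatch(state, [
  { op: 'add', path: '/expenses/1', value: { id: 1, name: 'Fire trucks' } },
  { op: 'replace', path: '/expenses/2/name', value: 'School supplies' },
]);
console.log(patched.expenses[2].name); // 'School supplies'
console.log(state.expenses[2].name);   // 'Supplies' — base unchanged
```

Replaying a log of such patches in order reconstructs the document at any point in its history, which is what makes the transaction-log analogy work.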

By leveraging immutability and expressing updates in differential terms, we’re able to build systems with powerful properties that further realize our ambition of imparting order in a disorderly world.
