Box Drive — Engineering faster starts and less memory
Box Drive 2.0 introduced a wonderful new feature called MFO, or mark for offline access, which is detailed in an earlier blog post. Box Drive is Box’s flagship desktop product, giving you complete access to your Box.com cloud file system on the desktop inside File Explorer (Windows) or Finder (Mac). As it turned out, work on the original Box Drive 1.0 started long before the Box Drive project was even conceived. We didn’t know it at the time, but the synchronization architecture of Box’s earlier desktop product (Sync4) would eventually evolve into a central pillar of Box Drive. With Box Drive 2.0 shipping, I felt it was time to talk a little about this synchronization architecture, called the engine. The discussion is framed around the VLLIS, a new component introduced in Box Drive 2.1.
The VLLIS is not a customer feature à la MFO. Rather, it tweaks the existing architecture in a way that improves startup time and reduces memory usage. Understanding why the VLLIS is so impactful begins with understanding its LIS (Logical Item Shadow) suffix and learning a bit about the engine. Here’s a 10,000 ft overview of the engine:
The 2 large/orange triangles are LIS instances. Each LIS resides within what’s called a Monitor (the plain black rectangle). There’s a Box side Monitor, with a Box LIS. And a Local side Monitor, with a Local LIS. You might note a smaller orange triangle in the Box Monitor. This is also a type of Shadow. A Shadow is basically an in-memory representation of a “file tree”, containing the metadata for all the files & folders in that tree. It takes RAM. The bigger the tree the more the memory. The bigger the tree, the longer it takes to initialize at startup.
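To make the idea of a Shadow concrete, here is a minimal sketch of an in-memory file tree. It is not Box Drive's actual data structure: the names `ShadowItem`, `Shadow`, `add`, and `path_of`, and the metadata fields, are all assumptions made for illustration. It only shows why cost scales with tree size: every file and folder gets its own entry that has to be built at startup and kept in RAM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ShadowItem:
    """Metadata for one file or folder (field names are hypothetical)."""
    item_id: str
    name: str
    parent_id: Optional[str]
    is_folder: bool
    child_ids: List[str] = field(default_factory=list)


class Shadow:
    """Hypothetical in-memory file tree: one ShadowItem per file or folder."""

    def __init__(self) -> None:
        self.items: Dict[str, ShadowItem] = {}

    def add(self, item: ShadowItem) -> None:
        # Every item costs one entry, so memory grows with the tree.
        self.items[item.item_id] = item
        if item.parent_id is not None:
            self.items[item.parent_id].child_ids.append(item.item_id)

    def path_of(self, item_id: str) -> str:
        # Walk up the parent links to rebuild the full path.
        parts: List[str] = []
        current: Optional[str] = item_id
        while current is not None:
            node = self.items[current]
            parts.append(node.name)
            current = node.parent_id
        return "/".join(reversed(parts))


shadow = Shadow()
shadow.add(ShadowItem("0", "All Files", None, True))
shadow.add(ShadowItem("1", "Reports", "0", True))
shadow.add(ShadowItem("2", "q1.xlsx", "1", False))
print(len(shadow.items), shadow.path_of("2"))
```

With a tree of hundreds of thousands of items, both the loop that populates `items` and the dictionary itself become noticeable, which is the cost the rest of this post is about.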
Everything in the engine architecture strives to have a clear contract: the purpose of that abstraction, what it expects as inputs, and what it produces as outputs. Here’s a one-sentence description of the state maintained by a LIS:
The LIS is the engine’s best understanding of what exists on the corresponding real file system
That’s a pretty simple concept. Consider the Box side… if you’re a Box customer (and we all should be!) you have an All Files page on Box.com that shows your Box.com cloud file system. Box Drive can’t know exactly what exists in the cloud at any given moment, because the cloud can change at any time, independent of the Box LIS being updated. That is why the LIS is a best understanding rather than an exact copy.
Looking back at the architecture diagram once again, notice that there are 2 flows, one with a purple sequence of arrows (flowing from the local side to Box). And a green flow, starting on the Box side and flowing to local. These arrows represent how “changes” are propagated from one side (the “source” side) to the other (aka “target” or “opposite”) side.
Notice the 4 bolded arrow segments. These are the segments that can alter the state of a LIS. The segments tickling the top of each triangle represent the following logic:
if source_LIS.is_a_change(_input_):
    # input represents a change/difference
    source_LIS.update_state(_input_)
    continue_along_purple_path(_input_)
else:
    drop(_input_)  # Since it's not a change, it can be dropped
This logic happens near the beginning of the data flow. Why it exists is pretty obvious given the one-sentence description of the LIS. The engine just learned that some item on the “source” side has a new state X, so the LIS should be updated to reflect that state.
The arrow segment coming to the LIS from the bottom, near the end of the flow, represents this logic:
# Arrow pointing at the real file system (Local or Box.com)
target_file_system.apply_change(_change_)
# Arrow pointing up at the LIS
target_LIS.update_state(_change_)
Since the engine knows that it just applied the change to the target file system, the corresponding LIS ought to be updated with that change. After all, the LIS is supposed to be our best understanding of that file system’s state.
At a high level, and glossing over various details, the combination of these 2 bits of logic allows the engine to identify (and drop) what we call echoback notifications. When a change is made on a file system, the engine has to detect that change in order to apply that change to the other (target) side. But the engine doesn’t want to respond to changes that it makes. Here’s an example:
- On local, file A is renamed to B
- That change is detected by the engine, starting the purple flow, ending with:
○ apply the change to Box.com
○ update the Box side LIS with the new state (name: ‘B’) of that item
- When Box.com gets a change it sends a notification to all interested clients.
- The engine (an interested client) gets a notification: “A was renamed to B”
- This starts the green flow
- Alas, this notification will not look like a change, because the Box LIS already has the new name for that item, so it is dropped.
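The rename example can be traced in a few lines of Python. This is only a sketch under simplifying assumptions, not the engine's real code: each LIS is reduced to a dictionary of item id to name, each file system is a plain dictionary, and the helper `flow` and the item id `"f1"` are invented for illustration. The method names `is_a_change` and `update_state` follow the pseudocode above.

```python
from typing import Dict, List


class LIS:
    """Simplified LIS: the engine's best understanding, as item id -> name."""

    def __init__(self, names: Dict[str, str]) -> None:
        self.names = dict(names)

    def is_a_change(self, item_id: str, name: str) -> bool:
        return self.names.get(item_id) != name

    def update_state(self, item_id: str, name: str) -> None:
        self.names[item_id] = name


def flow(source_lis: LIS, target_fs: Dict[str, str], target_lis: LIS,
         item_id: str, name: str, log: List[str], label: str) -> None:
    """One pass along a flow (hypothetical helper, not a Box Drive API)."""
    if not source_lis.is_a_change(item_id, name):
        log.append(f"{label}: dropped")  # not a change, e.g. an echoback
        return
    source_lis.update_state(item_id, name)  # top of the source triangle
    target_fs[item_id] = name               # apply to the real file system
    target_lis.update_state(item_id, name)  # arrow pointing up at the LIS
    log.append(f"{label}: applied")


local_fs = {"f1": "B"}  # the user has already renamed A to B locally
box_fs = {"f1": "A"}
local_lis = LIS({"f1": "A"})
box_lis = LIS({"f1": "A"})
log: List[str] = []

# Purple flow: the local rename is detected and applied to Box.com.
flow(local_lis, box_fs, box_lis, "f1", "B", log, "local->box")
# Green flow: Box.com notifies the engine about the rename it just received.
flow(box_lis, local_fs, local_lis, "f1", "B", log, "box->local")
print(log)
```

The second call is dropped only because the first call updated `box_lis` at the end of the purple flow. Without that update, the notification would look like a fresh change and be applied back to the local side.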
Early on in the development of Box Drive an interesting new reality came into focus as we considered 3 aspects of the design:
- One of the components of Box Drive is the file system. Box Drive creates a virtual file system that gets hooked into the operating system. As such this file system appears on the desktop just like your current files and folders. It’s where you can access all your Box.com cloud content.
- The file system, by design, does not generate echoback events. When the engine applies changes that originated on Box.com to the Box Drive owned file system (the last green arrow), those changes do not trigger a notification to the engine (the first purple arrow).
- Box Drive has access to perfect information on the state of that file system because Box Drive is the file system! And since echobacks aren’t generated (item 2 above), there is no need for a Local LIS.
Items 1 & 2 have been part of Box Drive since its first release. Item 3, however, was a dream unrealized, until Box Drive 2.1 that is. As described earlier, those pesky Shadows (triangles) are expensive abstractions in both startup time and memory. After a bunch of work, the 2.1 release of Box Drive replaced the Local side LIS with a new abstraction: the VLLIS, or Virtual Local Logical Item Shadow. The Virtual aspect means it doesn’t take any memory and doesn’t cost any time to initialize.
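One way to picture a "virtual" shadow is as an object that offers a LIS-like interface but answers every question by asking the file system directly, instead of keeping its own copy of the tree. The sketch below is an assumption about the general shape of such a design, not a description of how the VLLIS is actually implemented. `DriveFileSystem`, `VirtualLocalLIS`, `lookup`, and `get_state` are hypothetical names.

```python
from typing import Dict, Optional


class DriveFileSystem:
    """Stand-in for the Box Drive owned file system (hypothetical)."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def apply_change(self, item_id: str, name: str) -> None:
        self.names[item_id] = name

    def lookup(self, item_id: str) -> Optional[str]:
        return self.names.get(item_id)


class VirtualLocalLIS:
    """LIS-like interface that stores no tree of its own (illustrative)."""

    def __init__(self, fs: DriveFileSystem) -> None:
        self._fs = fs  # a reference only, so there is nothing to initialize

    def get_state(self, item_id: str) -> Optional[str]:
        # The file system is the source of truth, so just ask it.
        return self._fs.lookup(item_id)

    def update_state(self, item_id: str, name: str) -> None:
        # Nothing to store: the file system already holds the new state.
        pass


fs = DriveFileSystem()
vllis = VirtualLocalLIS(fs)

# End of the green flow: apply the change, then "update" the local shadow.
fs.apply_change("f1", "B")
vllis.update_state("f1", "B")
print(vllis.get_state("f1"))
```

Because the object holds only a reference, constructing it costs the same whether the file system contains three items or hundreds of thousands.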
Simple performance tests on a file system with ~200K files/folders show the following benefits:
The story of how the VLLIS was created and shimmed into the existing architecture is quite interesting, mainly due to a design mistake that was made in the original architecture back in 2013. Undoing that mistake was quite the exercise in engineering and will be detailed in a subsequent post.