
Background
This year we built Grow Gifts, a site for sending gifts to our clients and friends in the form of swag, including apparel (t-shirts, hoodies, etc.) and pillows. It works like this: we text someone a unique link to the site (containing an identifying hash), and they visit the site to pick out their gifts and confirm their mailing address.

We also wanted to use the exchange as an opportunity to introduce the broader Grow team and the places we love in Norfolk. Our friend Echard photographed each member of the team in Grow swag hanging out at their favorite spots around the city, 280 photos in all. These photos are the focus of the mobile experience, which we wanted to be app-like in its performance and functionality. Each photo is shown full-screen within the site, and swiping, the most familiar native gesture, moves the user between photos. The experience is structured in a T-shape: swipe left or right to see different swag items on different team members, or down to see more information about the current item.
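The T-shaped model can be sketched as a tiny state machine. This is a hypothetical illustration, not the site's actual code; the function names and direction semantics are our own assumptions:

```javascript
// Sketch of the T-shaped navigation model (illustrative, not Grow Gifts'
// production code). Left/right swipes move between swag items; a down
// swipe opens the detail view for the current item, up closes it.
function createNavigator(itemCount) {
  let item = 0;              // index of the swag item in view
  let showingDetail = false; // whether the info pane is open

  return {
    swipe(direction) {
      if (direction === "down") {
        showingDetail = true;  // reveal info about the current item
      } else if (direction === "up") {
        showingDetail = false; // back to the photo
      } else if (!showingDetail) {
        // swiping left reveals the next item (to the right), and vice versa
        const step = direction === "left" ? 1 : -1;
        item = (item + step + itemCount) % itemCount; // wrap around
      }
      return { item, showingDetail };
    },
  };
}
```

Keeping navigation state in one place like this makes it easy to ignore horizontal swipes while the detail view is open.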

If you look closely you’ll see that the images are actually rotating in 3D as they move. To echo the attention we paid to the physical packaging of our swag, we wanted the screens within the site to feel like the faces of a box.
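One way to get a box-face effect like the one described above, sketched here under the assumption that rotation is proportional to swipe progress (the actual values used on the site are unknown to us), is to compute a combined translate/rotate CSS transform per pane:

```javascript
// A minimal sketch (not the production code) of the "box face" effect:
// as a pane slides by `progress` (0 to 1) of the viewport width, it also
// rotates about the Y axis so adjacent panes read as faces of a box.
// maxAngle = 45 is an illustrative assumption.
function boxFaceTransform(progress, viewportWidth, maxAngle = 45) {
  const x = -progress * viewportWidth; // horizontal slide
  const angle = -progress * maxAngle;  // accompanying rotation
  // translate3d (rather than translate) nudges the browser to keep the
  // pane on its own composited layer
  return `translate3d(${x}px, 0, 0) rotateY(${angle}deg)`;
}
```

The string would be assigned to the pane's `style.transform` on each animation frame during a swipe.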
The Challenge
The technical challenge is this: how do we slide hundreds of full-screen images around in a mobile browser while maintaining a smooth 3D rotation effect?
We knew it would never work to simply line up all the photos along a huge strip of DOM and move them in response to swiping gestures. To achieve a smooth 3D rotation and translation effect we would need to promote each box “face” to its own composited layer, and if we did that for every image we’d quickly run out of memory on a mobile device. On top of that, the site’s interaction model would be unusually demanding on the GPU: a user can swipe multiple times per second, and each swipe requires a complete repaint of the screen.

The Solution
We came across the solution in this blogpost. The trick is to make the DOM as small as possible to support swiping actions in any direction, which means having photo panes templated above, to the right, below, and to the left of the center image, but no more than that. Let’s say the user is currently viewing a black t-shirt and swipes to the left, revealing a red t-shirt on the right. As soon as the black t-shirt image is completely displaced by the red t-shirt image, we rewrite the DOM so that the red t-shirt is now in the center of the DOM and a new t-shirt image is to its right. The black t-shirt image, formerly at the center, is now to the left of the viewport. Up/down swipes work the same way. The animation below shows how this works.
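Stripped down to one axis, the recycling logic might look like the following sketch (the names and wrap-around behavior are illustrative, not taken from the production code; the real site also keeps "up" and "down" panes):

```javascript
// Sketch of the pane-recycling trick for a horizontal row of items.
// Only three DOM panes ever exist: left, center, right. After a swipe
// settles, we rewrite which item each pane shows.
function createRecycler(items) {
  let center = 0; // index of the item currently in the viewport
  const n = items.length;

  const slots = () => ({
    left: items[(center - 1 + n) % n],
    center: items[center],
    right: items[(center + 1) % n],
  });

  return {
    slots,
    // called once the swipe animation completes; this is the
    // "retemplating" step that reassigns pane contents
    settle(direction) {
      const step = direction === "left" ? 1 : -1; // swiping left reveals the right pane
      center = (center + step + n) % n;
      return slots();
    },
  };
}
```

In the real DOM, `settle` would also rewrite each pane's image source and reset its transform, so the pane that just slid off-screen becomes the new off-screen neighbor.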

The technique is similar to the one used in native iOS scrolling views in which only a few cells above or below the viewport are actually rendered, and the resources for those views are swapped as the user requests more cells in either direction.
In our first prototype we blocked all touch events from the moment the user's finger left the screen to initiate a swipe until the rewriting of the DOM was complete. This felt like the right way to ensure that the crucial rewriting process wouldn't be interrupted. Unfortunately the result never felt responsive enough. We tried all sorts of things to get around this, such as queuing swipe events and speeding up the swipe animation, but we were only asymptotically approaching native-level responsiveness: each change improved the experience, yet the dropped events remained completely obvious to the user. We finally realized we would need to respond to touch events at all times.
To do this, we'd need the retemplating process to be as fast as possible: a user could swipe to the left and then, before the photo panes settled into their new positions, swipe left again and expect to see a new image immediately. We managed to get the process down to ~50ms.

We did this by repeatedly profiling, identifying bottlenecks, and optimizing: batching DOM updates, using plain loops instead of forEach, minimizing memory allocation during retemplating (to avoid triggering inopportune garbage collection events), and so on. The result was fast enough that most of the time there was no perceptible lag.
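As a hypothetical illustration of the allocation-minimizing style described above (not the actual site code), the next set of pane image sources can be written into a buffer that is allocated once and reused on every swipe, with an indexed for loop instead of forEach:

```javascript
// Illustrative micro-optimization: reuse a preallocated buffer and an
// indexed for loop so retemplating allocates nothing per swipe
// (forEach/map would allocate closures and intermediate arrays).
const SLOT_COUNT = 5;                   // center + 4 neighbors
const nextSrcs = new Array(SLOT_COUNT); // reused across every swipe

function computeNextSrcs(urls, centerIndex, out) {
  // Fill `out` with SLOT_COUNT consecutive urls starting one position
  // left of center; the slot layout is an illustrative simplification.
  const n = urls.length;
  for (let i = 0; i < SLOT_COUNT; i++) {
    out[i] = urls[(centerIndex - 1 + i + n) % n]; // no new arrays created
  }
  return out;
}
```

Because `out` is passed in and returned, the caller can hold a single buffer for the lifetime of the page and never give the garbage collector new work during a swipe.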
Ideas for Improvement
As one can see from the timeline screenshot above, despite our optimizations 22ms of the retemplating process was spent on scripting. That's 22ms the UI thread could have spent rendering and painting. This article describes an experimental technique that gets around the problem: moving all scripting not related to rendering or animation to a web worker. It would be interesting to see whether the offloaded effort is worth the overhead of the worker and the time required to communicate with it.
Another way to lessen the speed requirement on the retemplating process would be to render more than a single photo pane on either side of the viewport. So for example, if we rendered two photo panes on either side of the viewport, then the retemplating would only have to keep up with every other swipe gesture. Of course, the downside would be maintaining a larger DOM, and it would take benchmarking to determine whether the trade-off is worthwhile.
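The generalized buffer can be sketched as follows (the `radius` parameter and function name are our own): with `radius` panes pre-rendered on each side of the viewport, the DOM holds `2 * radius + 1` panes, and retemplating only needs to finish once every `radius` swipes rather than after every one.

```javascript
// Sketch of the wider-buffer variant: return the window of items that
// should be kept in the DOM around the current center. radius = 1 is
// the single-neighbor scheme described earlier; radius = 2 doubles the
// pre-rendered panes on each side.
function visibleWindow(items, centerIndex, radius) {
  const n = items.length;
  const out = [];
  for (let off = -radius; off <= radius; off++) {
    out.push(items[(centerIndex + off + n) % n]); // wrap around the ends
  }
  return out; // 2 * radius + 1 panes kept in the DOM
}
```

Benchmarking would then compare the memory cost of the larger window against the slack it buys the retemplating step.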