Building the Unsplash Uploader

Oliver Joseph Ash
Published in Unsplash Blog · 11 min read · Nov 22, 2018

At Unsplash, we just released a brand new version of the photo uploader. The new uploader has a very simple design, but looks are deceiving! Under the hood, there are lots of different possible states and edge cases to consider. In this article, we’ll provide an overview of the technical architecture and discuss challenges we faced along the way.

As part of this technical deep dive, we are excited to announce that we are releasing a prototype version of the uploader as open source! This prototype version is very close to the one that is in production today, with the exception of some style and content changes. We hope it’s a useful case study to anyone who wants to build a similar application or wants to use similar technologies/patterns (TypeScript, React, Redux, redux-observable, and finite-state machines).

Technical architecture overview

The uploader is built as a finite-state machine (FSM), using Unionize to define the state and action sum types, Redux to define the state transitions, and redux-observable to perform side effects.

This FSM architecture enables us to write code that matches our mental model of a user’s journey through the interface. If you’re interested in learning more about FSMs and why they are useful, we recommend the following resources:

We never made a deliberate decision to use FSMs; rather, we landed on this architecture accidentally. For some time we had been using tagged unions (aka sum types) to model various bits and pieces of application state, favouring them over simple booleans to improve code readability and eliminate impossible states. Thus, when we came to build the uploader, it was a natural choice to model the state with tagged unions, and from there we stumbled into the FSM pattern. (Modelling state as a tagged union is a core component of the FSM pattern.)
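To make the "impossible states" point concrete, here is a minimal sketch. The type and field names are illustrative and are not taken from the uploader's source; plain TypeScript discriminated unions are used rather than Unionize.

```typescript
// Boolean flags admit combinations that should never occur.
type FlagState = { isLoading: boolean; hasError: boolean; data?: string };

// This compiles, even though "loading and failed at once" is meaningless.
const impossible: FlagState = { isLoading: true, hasError: true };

// A tagged union (sum type) lists only the states that can really occur.
type UnionState =
  | { tag: "Loading" }
  | { tag: "Failure"; message: string }
  | { tag: "Success"; data: string };

// Every case must be handled, and each case only exposes the data
// that exists in that state.
const describeState = (state: UnionState): string => {
  switch (state.tag) {
    case "Loading":
      return "Loading";
    case "Failure":
      return `Failed: ${state.message}`;
    case "Success":
      return `Loaded: ${state.data}`;
  }
};
```

With the union, a view cannot accidentally read `data` while the request is still loading, because that field does not exist in the Loading case.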

Uploader FSM

In this section, we will illustrate the uploader FSM as a statechart, and then we will link this with the designs by walking through a user’s journey of the uploader.

The uploader application consists of three main stages, each of which is represented by a separate state in an FSM. In order, these states are Form, PublishingInProgress, and PublishingComplete. Here is what the top-level statechart for the application looks like:

Figure: top-level statechart with the states Form, PublishingInProgress and PublishingComplete.

(Note 1: these statecharts were created using the XState visualiser; however, we do not currently use XState to model our FSMs.)

(Note 2: these statecharts are not completely accurate, as they are simplified for the purposes of this article.)

The application begins in the Form state, with no files added.

From here the user can add files. For each file that is added to the form, we model its internal state with another FSM. We keep a list of these “form file states” inside of the Form state (shown above). Here is what the statechart for a file looks like:

Figure: file statechart with the states FetchingDimensions, Validate, Invalid, and Valid.

The file FSM begins in the FetchingDimensions state. We fetch the dimensions by loading the image in the browser.

If we fail to load the file’s dimensions (e.g. because the user tried to add a corrupt image or a non-image file type), we won’t be able to use this file, so we transition into the Failure state (inside FetchingDimensions). The view uses this state to show an error message.

Otherwise, we move on to validate the file (which we now know is definitely an image). This involves checking, for example, that the image is not too large and has sufficient megapixels.

If a file is invalid, we won’t be able to use it, so we transition into the Invalid state. The view uses this state to show an error message.

Otherwise, we transition into the Valid state which means we can (finally!) begin to upload the file. Uploading involves a sequence of two requests: FetchingPresignedUrl (read more about S3 presigned URLs) and then UploadingToS3.

If either of these requests fails, it transitions into its respective Failure state, where the user has the option of retrying it via the Retry action.

Note: we enforce a limit of 10 uploads per session. If the user tries to add more than this, files over the limit will not be added, and the user will see a warning.

Once all added files are uploaded (that is, every file is in the Valid state with UploadingToS3 in Success) and the user does not want to add any more, the user may submit the form.
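The file states walked through above can be sketched as nested tagged unions. This is a simplified illustration rather than the uploader's actual types: the field names, the megapixel threshold, and the rule that at least one file is required before submitting are all assumptions.

```typescript
// Request state with the inner states Loading, Failure and Success.
type RemoteData<E, A> =
  | { tag: "Loading" }
  | { tag: "Failure"; error: E }
  | { tag: "Success"; value: A };

type Dimensions = { width: number; height: number };

// The two sequential upload requests inside the Valid state.
type UploadStep =
  | { tag: "FetchingPresignedUrl"; request: RemoteData<string, string> }
  | { tag: "UploadingToS3"; request: RemoteData<string, null> };

type FileState =
  | { tag: "FetchingDimensions"; request: RemoteData<string, Dimensions> }
  | { tag: "Invalid"; reason: string }
  | { tag: "Valid"; dimensions: Dimensions; upload: UploadStep };

// Assumed threshold, for illustration only.
const MIN_MEGAPIXELS = 5;

// Transition taken once the dimensions are known: either Invalid, or Valid
// with the first upload request already in flight.
const validate = (dimensions: Dimensions): FileState => {
  const megapixels = (dimensions.width * dimensions.height) / 1e6;
  return megapixels < MIN_MEGAPIXELS
    ? { tag: "Invalid", reason: `Needs at least ${MIN_MEGAPIXELS} megapixels` }
    : {
        tag: "Valid",
        dimensions,
        upload: { tag: "FetchingPresignedUrl", request: { tag: "Loading" } },
      };
};

// A file is fully uploaded only in Valid > UploadingToS3 > Success.
const isUploaded = (file: FileState): boolean =>
  file.tag === "Valid" &&
  file.upload.tag === "UploadingToS3" &&
  file.upload.request.tag === "Success";

// Assumption: the form needs at least one file, and all files uploaded.
const canSubmit = (files: FileState[]): boolean =>
  files.length > 0 && files.every(isUploaded);
```

Because the upload steps only exist inside Valid, there is no way to represent an invalid file that is uploading.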

When the form is submitted, the PublishFiles action is dispatched, which causes the application to transition into the PublishingInProgress state. This is when we make a request to the API to instruct it to publish the files we previously uploaded to S3, which has the effect of making these photos visible on the user's profile.

Once all files are published, the Completed action is dispatched, which causes the application to transition into the PublishingComplete state. Depending on the result of the publish requests, the PublishingComplete state will transition into either the AllSucceeded, SomeFailed, or AllFailed inner state.

Figures: the views for the AllSucceeded, SomeFailed, and AllFailed inner states.

In the SomeFailed and AllFailed states, we give the user an option to retry publishing the failed images via the Rollback action, which causes the application to transition back to the Form state.
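The top-level transitions described in this section can be sketched as a reducer over tagged unions. This is a simplified illustration, not the production code: the payloads are reduced to a minimum, the field names are assumptions, and the Form state omits its list of files.

```typescript
type PublishResult = "Succeeded" | "Failed";

type CompleteInner = "AllSucceeded" | "SomeFailed" | "AllFailed";

type AppState =
  | { tag: "Form" }
  | { tag: "PublishingInProgress" }
  | { tag: "PublishingComplete"; inner: CompleteInner };

type AppAction =
  | { type: "PublishFiles" }
  | { type: "Completed"; results: PublishResult[] }
  | { type: "Rollback" };

// Derive the inner state of PublishingComplete from the publish results.
const summarise = (results: PublishResult[]): CompleteInner => {
  const failed = results.filter((result) => result === "Failed").length;
  if (failed === 0) return "AllSucceeded";
  return failed === results.length ? "AllFailed" : "SomeFailed";
};

// Each action is only honoured in the state where it makes sense;
// any other combination leaves the state unchanged.
const appReducer = (state: AppState, action: AppAction): AppState => {
  switch (state.tag) {
    case "Form":
      return action.type === "PublishFiles"
        ? { tag: "PublishingInProgress" }
        : state;
    case "PublishingInProgress":
      return action.type === "Completed"
        ? { tag: "PublishingComplete", inner: summarise(action.results) }
        : state;
    case "PublishingComplete":
      return action.type === "Rollback" && state.inner !== "AllSucceeded"
        ? { tag: "Form" }
        : state;
  }
};
```

Switching on the state first, and on the action second, keeps the code close to the statechart: each case reads as "in this state, these are the arrows that leave it".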

Technical challenges

When we were building the uploader, we faced several technical challenges that did not have well-established solutions, yet these challenges were not unique to this application. By sharing the details of these challenges, we hope that others can benefit from what we learned, or that someone can point us to better solutions!

Modelling requests with Observables and RemoteData

Every time we make a request, we model the request state using a RemoteData type. This is a neat abstraction which lets us express a request as an FSM, so we can continue to reap the benefits of using FSMs all the way down our application's state.

You can see these RemoteData types used within the file statechart above. For example, observe how the UploadingToS3 state contains the inner states Loading, Failure and Success.

However, a request is never just a static value — rather, it changes over time. For example, it usually begins as NotAsked, then transitions to Loading, and then either transitions to Failure or Success.

To express values which may change over time, we use Observables. They also support two essential ingredients:

  1. Cancellation. When a file is removed, we can unsubscribe from its epic, which is the Observable that represents all side effects for a given file. This has the effect of aborting any pending HTTP requests which correspond to that file.
  2. Retries. As demonstrated earlier, we provide the user with an option to retry any failed requests (e.g. the request to upload a file to S3). Observables make this easy via the retry operator.

In order to use RemoteData inside Observable, we wrote a small helper function, ajaxUsingRemoteData, that makes a request (using RxJS's built-in ajax helpers) and returns the type Observable<RemoteData<FailureData, SuccessData>>:

  • When the request begins, the Observable will emit Loading.
  • Each time there is a ProgressEvent update for the request, the Observable will emit Loading along with the current progress.
  • When the request completes: if it failed, the Observable will emit Failure<FailureData>, else it will emit Success<SuccessData>.
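The emissions listed above can be illustrated without RxJS by looking only at the mapping step. The sketch below does not reproduce ajaxUsingRemoteData: the RequestEvent type is hypothetical, and the exact shape of RemoteData (for instance, how progress is stored) is an assumption.

```typescript
type RemoteData<E, A> =
  | { tag: "NotAsked" }
  | { tag: "Loading"; progress: number | null } // fraction between 0 and 1
  | { tag: "Failure"; error: E }
  | { tag: "Success"; value: A };

// Hypothetical events a request can produce over its lifetime.
type RequestEvent<E, A> =
  | { kind: "started" }
  | { kind: "progress"; loaded: number; total: number }
  | { kind: "failed"; error: E }
  | { kind: "succeeded"; value: A };

// Map one request event to the RemoteData value emitted for it.
function toRemoteData<E, A>(event: RequestEvent<E, A>): RemoteData<E, A> {
  switch (event.kind) {
    case "started":
      return { tag: "Loading", progress: null };
    case "progress":
      return {
        tag: "Loading",
        progress: event.total > 0 ? event.loaded / event.total : null,
      };
    case "failed":
      return { tag: "Failure", error: event.error };
    case "succeeded":
      return { tag: "Success", value: event.value };
  }
}

// The values a successful upload would emit, in order.
const uploadEvents: RequestEvent<string, null>[] = [
  { kind: "started" },
  { kind: "progress", loaded: 512, total: 1024 },
  { kind: "succeeded", value: null },
];
const emitted = uploadEvents.map((event) => toRemoteData(event));
```

In the Observable version, each of these values would be one emission, so a subscriber always holds the latest state of the request.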

We think this helper might have some legs beyond this application, whenever you need both Observable and RemoteData, so we are considering publishing it as an independent utility. If you're interested you can refer to the source here. Let us know what you think!

Performing side effects in a Redux finite-state machine

We’ve previously written about how Redux can be used to build an FSM. However, one critical piece we didn’t touch on in that article is how to incorporate side effects into this architecture.

To illustrate the problem, we’ll use a small example. Consider a simple FSM with the following state and action types (pseudocode):

Once we have fetched the image dimensions (action FetchedDimensions), we validate them and either transition into the Uploading or Invalid states:
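As a minimal sketch, the state and action types and the transition just described might look like this. The FetchingDimensions initial state, the payload shape, and the validation rule are assumptions made for illustration.

```typescript
type Dimensions = { width: number; height: number };

type FileState =
  | { tag: "FetchingDimensions" }
  | { tag: "Uploading"; dimensions: Dimensions }
  | { tag: "Invalid" };

type FileAction = { type: "FetchedDimensions"; dimensions: Dimensions };

// Assumed validation rule: at least 5 megapixels.
const isValid = (dimensions: Dimensions): boolean =>
  dimensions.width * dimensions.height >= 5e6;

// FetchedDimensions only has an effect while we are fetching dimensions;
// in any other state it is ignored.
const fileReducer = (state: FileState, action: FileAction): FileState => {
  if (state.tag === "FetchingDimensions" && action.type === "FetchedDimensions") {
    return isValid(action.dimensions)
      ? { tag: "Uploading", dimensions: action.dimensions }
      : { tag: "Invalid" };
  }
  return state;
};
```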

Using reducers, we can easily describe what the next state should be when an action occurs. However, there is no clear way for us to trigger a side effect.

In this case, when we transition into the Uploading state, we want to trigger a side effect to start the upload request.

It’s interesting to consider how this problem is solved in other definitions and implementations of FSMs. Here’s the Erlang documentation’s definition of an FSM:

State(S) × Event(E) → Actions (A), State(S’)

If we are in state S and the event E occurs, we should perform the actions A and make a transition to the state S’.

This signature very closely resembles a reducer, which looks like:
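In its simplest form, the reducer signature can be written as the following type. The counter reducer is only an illustrative instance.

```typescript
// Current state and action in, next state out.
// Nothing in this signature can describe a side effect.
type Reducer<S, A> = (state: S, action: A) => S;

// Illustrative instance of the signature.
type CounterAction = { type: "Increment" } | { type: "Reset" };

const counterReducer: Reducer<number, CounterAction> = (state, action) =>
  action.type === "Increment" ? state + 1 : 0;
```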

The formal FSM terminology differs slightly: “actions” are called “events”. But much more significant is how this formal definition includes “actions” as something resulting from the reducer/transition function (in addition to the next state).

Under this definition, there are two types of “actions”: events (e.g. a button click), which are fed into the reducer, and actions resulting from the reducer (e.g. an instruction to perform an HTTP request), which are fed into a function to perform the side effects.

What if we could build on top of reducers to support this pattern?

The actions resulting from the reducer would then be fed into another function where they could be performed. For example, they could be done in an epic:
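A minimal sketch of this idea follows, assuming hypothetical type names and a made-up validation rule. The transition function returns the next state together with a list of effects described as plain data, and a separate function performs them. Here that function only records the effects; in a Redux application it could be an epic that starts the upload request.

```typescript
type Dimensions = { width: number; height: number };

type UploadState =
  | { tag: "FetchingDimensions" }
  | { tag: "Uploading" }
  | { tag: "Invalid" };

// Events are fed into the transition function...
type UploadEvent = { type: "FetchedDimensions"; dimensions: Dimensions };

// ...and effects come out of it, described as plain data.
type Effect = { type: "StartUpload" };

// Assumed validation rule: at least 5 megapixels.
const isValid = (dimensions: Dimensions): boolean =>
  dimensions.width * dimensions.height >= 5e6;

// State x Event -> (State, Effects)
const transition = (
  state: UploadState,
  event: UploadEvent
): [UploadState, Effect[]] => {
  if (state.tag === "FetchingDimensions" && event.type === "FetchedDimensions") {
    return isValid(event.dimensions)
      ? [{ tag: "Uploading" }, [{ type: "StartUpload" }]]
      : [{ tag: "Invalid" }, []];
  }
  return [state, []];
};

// Interprets the effects. This stand-in only records what it would do.
const performEffects = (effects: Effect[], log: string[]): void => {
  effects.forEach((effect) => {
    log.push(`performing ${effect.type}`);
  });
};
```

The transition function stays pure and easy to test, because it only describes the effect; whether and how the request is made is decided elsewhere.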

This is not a new idea! This is similar to how side effects work in Elm, and some plugins already exist to bring a similar pattern to Redux.

We ultimately decided against using this pattern because it’s rarely seen or used in the Redux ecosystem, but we’re excited to see if this idea increases in popularity as more people begin to use Redux to model FSMs.

The way we ended up solving this instead was by listening to the state changes in our epics and inferring when a new state was entered.
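The core of that inference is a comparison between two consecutive states. The sketch below shows it over a plain array of states so that it has no dependencies; in an epic, the same comparison would be applied to consecutive emissions of state$ (for example, paired up with RxJS's pairwise operator). The state names are taken from the small example above, and the helper names are hypothetical.

```typescript
type UploadState =
  | { tag: "FetchingDimensions" }
  | { tag: "Uploading" }
  | { tag: "Invalid" };

type UploadTag = UploadState["tag"];

// A state was "entered" when we were not in it before and are in it now.
const entered = (
  tag: UploadTag,
  previous: UploadState,
  next: UploadState
): boolean => previous.tag !== tag && next.tag === tag;

// Pair each state with its predecessor and count how often `tag` was entered.
// Each entry is the point where a side effect would be triggered.
const countEntries = (states: UploadState[], tag: UploadTag): number => {
  let count = 0;
  let previous: UploadState | undefined = undefined;
  states.forEach((next) => {
    if (previous !== undefined && entered(tag, previous, next)) {
      count += 1;
    }
    previous = next;
  });
  return count;
};
```

Comparing against the previous state matters: reacting to every state whose tag is Uploading would restart the upload each time an unrelated part of the state changed.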

There are other ways of implementing FSMs outside of the Redux ecosystem which do have solutions to this problem, but unfortunately we didn’t have time to fully consider these options. However, we were pleased with the progress we were able to make using our existing tools.

If you’re interested in the ideas discussed here, we started a discussion which we’d love for you to join!

Epics for dynamic list state

One problem we ran into when using redux-observable is that there appeared to be no clearly established pattern for working with dynamic list state. In a dynamic list, each list item has its own local state and set of side effects. In our case, we have a dynamic list of file states, each of which has its own side effects such as the HTTP request to upload a file to S3. We can easily define an epic for any unit of state, but how exactly do we run an instance of this epic for each item in our dynamic list?

To illustrate the problem, we’ll use a small example. Consider an application that displays a list of counters, with the option to add a new counter or remove an existing one.

Each counter in the list has its own state:

The application state will be modelled as a list of our CounterStates.

In order to do the actual counting, each counter in the list will need a timer. We can define this with an epic:

We have an epic which we want to use for each item in the list (in this example, each counter), but how do we actually run this epic? Specifically we need the following behaviour:

  • When a list item is added, we need to subscribe to a new epic for the given list item.
  • When a list item is removed, we need to unsubscribe from any existing epics for the given list item in order to abort any work, such as cancelling the interval in this example.
  • Ideally, the list item epic should receive its local state as an observable, rather than the global state$. (Although sometimes child components need to have access to parent state, so there should be some level of control over what state gets passed in.)

The way we initially achieved this was by creating “added” and “removed” actions for the list items, which are dispatched when a component corresponding to the list item state mounts and unmounts, respectively. We can then listen to these actions in our epics to manage the lifecycle of each list item’s epic:

(See full example.)

This meets all the requirements, but there is a lot of ceremony involved, because we must define, dispatch, and handle the added/removed actions. These events are already provided by the state in the form of our state’s array/object additions/deletions — wouldn’t it be nice if we could just use that?

The way we ended up solving this instead was with a small helper function, runListEpics, that neatly handles all of our requirements:

(See full example.)

Thanks to this runListEpics helper, there is much less wiring and boilerplate. Instead of using actions to control the lifecycle of the epic, the epic lifecycle "just works" according to the state:

  • When an item is added to the list, a new epic will be instantiated and subscribed to.
  • When an item is removed from the list, the epic will be unsubscribed.
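The state-driven lifecycle in the two points above can be sketched without RxJS. This is not how runListEpics is implemented: plain callbacks stand in for epics and subscriptions, and all names are hypothetical. The point is only that additions and removals can be derived from the list itself.

```typescript
// Starting work for an item returns a function that stops it
// (the equivalent of unsubscribing from the item's epic).
type Stop = () => void;
type StartItem = (id: string) => Stop;

// Keeps one running task per list item, driven purely by the list of ids.
const createListRunner = (start: StartItem) => {
  const running = new Map<string, Stop>();

  // Call this with the current ids every time the list state changes.
  return (ids: string[]): void => {
    // Items that disappeared from the list: stop their work.
    running.forEach((stop, id) => {
      if (ids.indexOf(id) === -1) {
        stop();
        running.delete(id);
      }
    });
    // Items that are new to the list: start their work.
    ids.forEach((id) => {
      if (!running.has(id)) {
        running.set(id, start(id));
      }
    });
  };
};

// Usage: record what happens as the list changes.
const lifecycleLog: string[] = [];
const syncCounters = createListRunner((id) => {
  lifecycleLog.push(`start ${id}`);
  return () => {
    lifecycleLog.push(`stop ${id}`);
  };
});

syncCounters(["a", "b"]); // starts a and b
syncCounters(["b", "c"]); // stops a, starts c; b keeps running
```

No "added" or "removed" actions are needed; the difference between two versions of the list carries the same information.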

We won’t go into detail here about how exactly runListEpics is implemented, but if you're interested you can refer to the source.

We think this helper might have some legs beyond this application, so we are considering publishing it as an independent utility. Let us know what you think! If you’re interested in the ideas discussed here, we started a discussion which we’d love for you to join!

Delayed state transitions

Sometimes it is necessary to artificially delay a transition from one state to another. In our case, this was necessary for the transition from the PublishingInProgress state to the PublishingComplete state. On a fast internet connection, the requests to publish each of the files (which happen as part of the PublishingInProgress state) are likely to complete very quickly. If we transitioned to the PublishingComplete state immediately, the user would have no chance to observe the PublishingInProgress state, and the UI would feel jumpy. Thus, we artificially delay the transition to the PublishingComplete state.

We were already using an epic to define when the transition should happen (when all of the requests have completed), so it was easy to add this delay using some RxJS magic. For example:
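As a dependency-free sketch of the same idea, here is the minimum-duration logic written with plain Promises instead of RxJS. In an epic, a similar effect could be achieved by, for instance, combining the completion stream with a timer. The function names and the durations are hypothetical.

```typescript
// Resolves after the given number of milliseconds.
function wait(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => resolve(), ms);
  });
}

// Resolves with the result of `work`, but never sooner than `minimumMs`
// after this function was called. If the work takes longer than that,
// no extra delay is added. A rejection of `work` is passed on immediately.
function withMinimumDuration<T>(work: Promise<T>, minimumMs: number): Promise<T> {
  // Start the timer right away so it runs alongside the work.
  const minimum = wait(minimumMs);
  return work.then((result) => minimum.then(() => result));
}

// Usage: even if publishing finishes instantly, the transition to
// PublishingComplete would happen no sooner than 30ms later.
const publishing = withMinimumDuration(Promise.resolve("published"), 30);
```

Starting the timer alongside the work, rather than after it, means slow connections are not penalised with an additional wait.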

You can see this in action by referring to the source code here.

Thanks to Sami, Seb, David, and Charles for their feedback and input on this article.

Don’t forget to check out the open source uploader prototype.

If you like how we do things at Unsplash, consider joining us!
