Erasure

Jeff Pollard
Published in strava-engineering · 9 min read · Aug 31, 2018

On May 25th, 2018, the European Union's General Data Protection Regulation (GDPR) went into effect. It enacted a sweeping set of new rules intended to give European citizens better control over their own data. Of concern here is one particular article of the GDPR: the right to erasure. It states:

The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay [..]

In more direct language, and with some notable exceptions, the article provides that European users may request and have executed the permanent deletion of their data.

This blog post presents a high-level overview of the system Strava implemented to provide our customers with that functionality.

Background

Like many larger companies, Strava stores data in a variety of technologies and formats. As a sampling, we have:

  1. Relational data stored in various MySQL databases and tables.
  2. Columnar data stored in various Cassandra clusters and tables.
  3. Key/Value data stored in S3, and various Redis servers and clusters.
  4. Streaming data stored in Kafka, SQS, Redis, and other similar technologies.
  5. Relational and columnar data stored in our data warehouse.

When a customer asks to delete data, we need to be able to find and erase their data across all these systems within 30 days. Additionally, like those of a lot of other internet/data companies, most of our systems had been carefully designed and tuned to handle a heavy write load. Much less historical thought and effort had gone into how we would delete data.

Research

To add deletion capability, a small handful of engineers at Strava spent time cataloging all of our data systems. We had to figure out not only

  1. what data we had, but
  2. if it was possible to easily and efficiently delete data from these systems for a specific user.

I wish I could say we did this via some cool programmatic technique or something less tedious than manually examining all the data stores we use and reading lots of code, but that’s what we had to do. Sometimes programming is not very glamorous.

Deletion Capability

Data at Strava generally fell into one of three buckets in terms of how readily it supported deleting a specific user's data:

  1. Functionality already existed to delete data by a particular user.
  2. Functionality didn’t exist to delete data by a particular user, but the data was stored in such a way that it was possible to add that capability relatively easily.
  3. Data was stored in a way that made it computationally expensive to delete a particular user’s data.

Bucket 1 was already handled for us, so there was no work to do there. We felt comfortable adding the capability where needed for bucket 2, as most of that work was relatively easy and well scoped. For bucket 3, our general strategy was to use data store functionality to add a 30-day TTL to every record. This meant that all records, not just those for a particular user, were deleted automatically by the data store after 30 days.

Erasure the Service

As we began work to add deletion capability to services, in parallel we started work on a system to coordinate execution of the deletion of data. After some discussion, internal design sessions, and architecture reviews, we settled on creating a new “deletion coordination” service, named Erasure.

Overview

Erasure is a logical service made up of a few smaller apps. The main app is a Finagle Thrift RPC server gating access to internal state stored in MySQL. Orbiting the Thrift server are three stream processing components providing background processing, coordination, and failure recovery mechanisms.

The following sections elaborate on the design and architecture of the Erasure service, going over different components.

Modeling deletes

The user data we need to delete sits behind a wide variety of access patterns. Some data sets are fronted by a service, which gates all access to the data. Other data sets have no service, so we'd access the data store directly. Some data is normalized and indexed on user ID, such as in a MySQL table, while other data is denormalized or has no indexing useful for reaching a particular user's data.

Given this broad and varied data access, Erasure needs to be abstract and extensible, affording deletion operations flexibility in their behavior and implementation. Consequently, Erasure is not prescriptive in how deletes are run. Rather, it is a runtime for individual deletion tasks, with coordination functionality to ensure deletions are scheduled, run, and tracked correctly.

Command Providers and Processors

All deletion actions run by Erasure are modeled as an abstract structure called a Command. For any given user who is having their data deleted, Erasure runs dozens of commands to delete data from various data stores.

When an athlete asks to be deleted, commands are produced by CommandProvider instances. Each data set at Strava which needs to be deleted has a CommandProvider registered inside Erasure, and Erasure will ask the provider to generate commands every time a user asks to be deleted. A data set's commands are then run by the data set's CommandProcessor, which contains the actual runtime logic to execute the command.
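
As a rough illustration of these abstractions: the post names Command, CommandProvider, and CommandProcessor, but the fields and method signatures in this sketch are assumptions, not Strava's actual interfaces.

```scala
import scala.concurrent.Future

// Illustrative only: a freeform unit of deletion work for one data set.
// The post describes commands as runnable units; these fields are assumed.
case class Command(dataSet: String, userId: Int, payload: Map[String, String])

// Generates the commands needed to delete one user's data from a data set.
trait CommandProvider {
  def commandsFor(userId: Int): Seq[Command]
}

// Contains the runtime logic to execute a single command.
trait CommandProcessor {
  def process(command: Command): Future[Unit]
}
```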

Command Guidelines

There is no hard-and-fast rule for determining what makes up a command or how to process it. We wanted Erasure to be flexible in what kinds of actions it runs for deletion, so a Command is very freeform. Erasure mostly treats commands as a runnable unit which is run until completion.

That said, we did end up with some general principles and command-processing guarantees that we found helped keep commands sane (a sketch of a processor written to these guidelines follows the list):

  1. Commands should expect at-least-once processing, and thus must be idempotent.
  2. Only one worker at a time will process a given instance of a command, but many commands may be processed in parallel.
  3. Commands should expect to be aggressively retried until completed.
  4. Commands should prefer deleting most/all of the data in a single large request to deleting a portion of the data with multiple smaller requests.
  5. Commands should prefer making partial progress, instead of no progress, in failure scenarios.
  6. Commands should aim to finish within a short amount of time, ideally just a few minutes or less.
  7. Commands should expect to be interrupted. Erasure will interrupt any commands lasting longer than an hour, and schedule them for retry.
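
Building on the hypothetical CommandProcessor trait sketched earlier, here is what a processor written to these guidelines might look like. The table name, schema, and JDBC details are assumptions for illustration, not Strava's actual code.

```scala
import java.sql.DriverManager
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical processor for a MySQL-backed data set, written against the
// guidelines above.
class MySqlUserSettingsProcessor(jdbcUrl: String)(implicit ec: ExecutionContext)
    extends CommandProcessor {

  def process(command: Command): Future[Unit] = Future {
    val conn = DriverManager.getConnection(jdbcUrl)
    try {
      val stmt = conn.prepareStatement("DELETE FROM user_settings WHERE user_id = ?")
      stmt.setInt(1, command.userId)
      // A plain DELETE is idempotent: re-running it after the rows are gone
      // removes zero rows, so at-least-once delivery and aggressive retries
      // (guidelines 1 and 3) are safe. A single bulk statement also prefers
      // one large request (guideline 4) and finishes well inside the one-hour
      // interruption window (guidelines 6 and 7).
      stmt.executeUpdate()
      ()
    } finally conn.close()
  }
}
```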

Command Dependencies

For any given user, we cannot simply make a big list of all commands we need to run and run all of them in parallel. Amongst commands, there are dependency relationships and ordering concerns. For example, some data to be deleted is indexed not by user ID but by activity ID, so a command may need to look up all the user's activities first in order to delete that data activity by activity. If the activities were deleted first, we would lose this capability.

We also cannot run all deletion commands serially. Deleting all of a user's data requires executing logic across many services, each of which may itself require the deletion of tens, hundreds, or even thousands of entities. So we must process at least some of the commands in parallel to achieve expedient deletion.

To model these relationships, Erasure models deletions as a tree of commands, internally called a “Work Tree.”

Each Command is rooted at a particular path in the tree. The path resembles a filesystem path, and uses "parent folders" to define which dependencies a command has in order to execute. Any number of commands may be rooted at a particular path.

For example, in the following scenario:

/a   -> Command_A
/b   -> Command_B
/b/c -> Command_C
/b/d -> Command_D

Command_C and Command_D are children of Command_B. This denotes that commands rooted at /b cannot run until all of their children have completed.

Once Command_C and Command_D have completed, then Command_B can run. Command_A is free to run immediately, as it has no children.
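
A minimal sketch of how runnable commands might be determined from such a tree, assuming an illustrative in-memory representation rather than Erasure's actual MySQL schema:

```scala
// Illustrative in-memory model: each node pairs a filesystem-style path with
// a completion flag. Any number of commands may hang off a given node.
case class TreeNode(path: String, complete: Boolean)

object WorkTree {
  // A node is runnable once every node strictly "below" it (whose path starts
  // with this node's path plus "/") has completed.
  def runnable(nodes: Seq[TreeNode]): Seq[TreeNode] =
    nodes.filter { node =>
      !node.complete &&
        nodes.forall(child => !child.path.startsWith(node.path + "/") || child.complete)
    }
}

// With the example tree above, runnable(...) initially returns /a, /b/c, and
// /b/d; /b becomes runnable only after /b/c and /b/d complete.
```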

Deletion Processing Model

Erasure deletion kicks off with a call to its Thrift endpoint:

ForgetResponse forget(1: i32 user_id)

This endpoint builds the work tree of Command instances for the particular user, and materializes the tree in MySQL. All future processing of the tree is then done by two downstream workers.

  1. Tree worker — only one instance exists. Responsible for querying current work tree state, determining processable commands from a work tree, and enqueuing them for processing.
  2. Command worker — multiple instances. Dequeues and processes commands enqueued by the tree worker.

Once forget() has materialized the tree to MySQL, it enqueues the tree ID into a queue for processing by the tree worker. That worker parses the current state of the work tree, determines if there are new commands which can be processed, and if so, enqueues those commands for processing.

Commands are enqueued into a “command queue.” Consuming that queue are command workers which dequeue commands off the command queue, and process them.
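
Schematically, the two workers form a loop over the two queues. The sketch below, building on the earlier illustrative types, uses in-process blocking queues as stand-ins for the real queues and elides persistence, batching, and error handling:

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.concurrent.Await
import scala.concurrent.duration._

object ErasureWorkersSketch {
  // Stand-ins for the real queues described in the post.
  val treeQueue    = new LinkedBlockingQueue[Long]()    // work tree IDs
  val commandQueue = new LinkedBlockingQueue[Command]() // runnable commands

  // Tree worker (single instance): re-evaluates a tree's state and enqueues
  // any commands that have become runnable.
  def treeWorkerLoop(loadTree: Long => Seq[TreeNode],
                     commandsAt: TreeNode => Seq[Command]): Unit =
    while (true) {
      val treeId = treeQueue.take()
      WorkTree.runnable(loadTree(treeId)).flatMap(commandsAt).foreach(commandQueue.put)
    }

  // Command worker (many instances): processes commands, then re-enqueues the
  // tree ID so the tree worker can look for newly unblocked commands.
  def commandWorkerLoop(processorFor: Command => CommandProcessor,
                        treeIdOf: Command => Long): Unit =
    while (true) {
      val command = commandQueue.take()
      // The post notes commands running longer than an hour are interrupted
      // and retried; a timeout on the result approximates that here.
      Await.result(processorFor(command).process(command), 1.hour)
      // In the real system the tree ID is enqueued once the command completes
      // (including the offsite case, which is elided in this sketch).
      treeQueue.put(treeIdOf(command))
    }
}
```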

Command Processing

The eventual goal of command processing is to "complete" a command: that is, all of the logic necessary for it to be done has run. Commands are generally completed by the command worker directly; they simply call a synchronous service endpoint, and once that endpoint returns a response the command is complete. For other commands, the command worker starts processing by enqueuing work to be run asynchronously offsite by some other system, namely our main web application.

To model these possible outcomes, after processing a Command the command worker marks it with one of three possible end states:

  1. failed — the command failed to process, and should be retried again.
  2. complete — the command completed processing. No further work to do on the command.
  3. offsite — the command processed, but triggered more work offsite. That offsite worker will complete the command.

In the offsite case, the common scenario for us is the command enqueues a background job for processing by our background jobs system owned by our main web application. That job, once done, notifies Erasure of the completion of the command.
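
One natural way to model these end states is a small sum type; this is illustrative only, as the post does not describe Erasure's internal representation.

```scala
// Possible end states recorded by the command worker after processing.
sealed trait CommandOutcome
case object Failed   extends CommandOutcome // will be retried with backoff
case object Complete extends CommandOutcome // nothing left to do
case object Offsite  extends CommandOutcome // an external system (e.g. a job in
                                            // the main web app) will complete it
```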

Iterative Processing

Every time a command is completed, the worker enqueues the tree ID for that command into the tree queue. When the tree worker dequeues that ID, the tree worker repeats the processing all over again, looking for processable commands, and enqueuing them. If there are no more commands to process, the work tree is done!

Handling Failure

While we didn’t assume failures to be common, it would be myopic to assume all commands will successfully process every time. Occasionally there will be failures, and so Erasure will need to be designed to handle those failures.

If a particular command fails, Erasure is configured to attempt five retries, with exponential backoff between each retry. If the command still fails after the fifth retry, we place it into a "dead letter queue." This sequesters chronically failing commands out of the way, ensuring that subsequent commands in the queue continue to process. SQS has out-of-the-box support for dead letter queues, so this was a very easy thing to add.

If necessary, the commands in the dead letter queue can then be analyzed to determine why they are failing. In most cases, however, we just want to automatically retry them. This is done by an additional "supervisor" worker, which periodically transfers failed commands from the dead letter queue back to the command queue for reprocessing. From there the command workers attempt to process them just as they would any other command.
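
As a rough sketch of that redrive step using the AWS SDK for Java, with placeholder queue URLs and batch size (the real supervisor's client code may differ):

```scala
import com.amazonaws.services.sqs.AmazonSQSClientBuilder
import com.amazonaws.services.sqs.model.ReceiveMessageRequest
import scala.jdk.CollectionConverters._

object DeadLetterSupervisor {
  // Periodically move messages from the command dead letter queue back onto
  // the main command queue so the command workers retry them.
  def redrive(deadLetterQueueUrl: String, commandQueueUrl: String): Unit = {
    val sqs = AmazonSQSClientBuilder.defaultClient()
    val received = sqs.receiveMessage(
      new ReceiveMessageRequest(deadLetterQueueUrl).withMaxNumberOfMessages(10))
    received.getMessages.asScala.foreach { message =>
      sqs.sendMessage(commandQueueUrl, message.getBody)
      sqs.deleteMessage(deadLetterQueueUrl, message.getReceiptHandle)
    }
  }
}
```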

The supervisor also looks for any stuck trees which seem not to be processing but have not completed. If found, the supervisor enqueues those tree IDs into the tree queue. Given the design of Erasure processing, stuck trees shouldn't technically be possible, but we have occasionally observed them in the wild. Enqueuing a tree ID for the tree worker is safe at any time, even for a completed tree: if there are no newly processable commands or the tree is already done, the tree worker will simply no-op.

Deleting its own Data

The last piece of functionality needed for Erasure is for it to delete its own data. Erasure's data is itself part of the user data we need to delete, so we want it to remove its own records once all other deletion is done. To do this, we have a daily "cleanup" job which removes data for work trees that have been completed.
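
The cleanup itself could be as simple as a scheduled SQL delete; a minimal sketch, assuming a hypothetical work_trees table with a completed_at column rather than Erasure's actual schema:

```scala
import java.sql.DriverManager

object CleanupJob {
  // Hypothetical daily cleanup: remove rows for work trees that have finished,
  // returning the number of trees deleted.
  def cleanupCompletedTrees(jdbcUrl: String): Int = {
    val conn = DriverManager.getConnection(jdbcUrl)
    try {
      conn.createStatement().executeUpdate(
        "DELETE FROM work_trees WHERE completed_at IS NOT NULL")
    } finally conn.close()
  }
}
```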

Results

Since its launch, Erasure has been quite successful. It has processed approximately 1.3 million commands. While there have been a handful of small errors which have come up requiring operator intervention or code changes, nearly all commands have run successfully without issue. We feel this is a testament both to spending time upfront to understand and design for the core requirements of the system, and to structuring processing for idempotency, resilience, and the handling of failure.
