Instapaper Outage Cause & Recovery
Brian Donohue
66868

Been there: several years ago at a customer we hit the 2TB limit in ext3, although not in RDS.

Downtime wasn’t an option, so while preparing a long-term solution (migration to XFS), the quick workaround was to:
* identify old data
* dump it
* delete it
Freeing 0.01% of the space was enough to bring the site back up after just a few minutes of downtime.
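
For scale, 0.01% of 2TB is only about 200MB. The three steps could be sketched roughly as below. This is a minimal illustration, not what we actually ran: it uses Python's built-in `sqlite3` as a stand-in for the real database, and the `articles` table, its columns, and the `archive_and_delete` helper are all hypothetical names. In a real incident the dump would have to be written to a different volume than the full one.

```python
import csv
import sqlite3


def archive_and_delete(conn, cutoff, dump_path):
    """Dump rows older than `cutoff` to a CSV file, then delete them.

    Assumes a hypothetical `articles` table whose `created_at` column
    holds ISO-8601 date strings, so string comparison orders by date.
    Returns the number of rows archived.
    """
    cur = conn.cursor()

    # 1. Identify old data.
    cur.execute(
        "SELECT id, user_id, created_at, body FROM articles "
        "WHERE created_at < ? ORDER BY id",
        (cutoff,),
    )
    rows = cur.fetchall()

    # 2. Dump it (to a path on another volume in a real incident).
    with open(dump_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "user_id", "created_at", "body"])
        writer.writerows(rows)

    # 3. Delete it, only after the dump has been written out.
    cur.execute("DELETE FROM articles WHERE created_at < ?", (cutoff,))
    conn.commit()
    return len(rows)
```

Writing the dump before issuing the delete means the old rows can be restored later, once the long-term fix is in place.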

It is easier to tell a small subset of your customers that their data is _temporarily_ unavailable than to have such a long outage affecting all your customers.

I also think another option you had was to drop one of the secondary indexes. Performance would degrade, but reads are easy to scale out.