The GDPR has thankfully given individuals greater rights to access the data that organizations hold about them. (Of course, this has some caveats in practice: the data subjects have to be EU residents to avail of the protection afforded under the Regulation by the European Union, among other fine technical points.)

As readers of this blog are by now surely aware, I have a longstanding personal interest in keeping copies of all my own data, whether it’s on my local network, on a cloud I manage, or on a private cloud operated by a SaaS company to which I entrust my data. (If not, check out my backup documentation.)

For that reason, I have begun a GitHub project documenting the data export approaches of various cloud providers.

Observations and Comparison

Some observations from that process so far:

  • The approaches taken by SaaS companies and cloud providers are highly inconsistent and non-standardized: They range all the way from “we let you export all your data automatically at any time” (Twitter) to “here’s a chunk of JSON for you; do what you like with it” (Trello) to “write to our support people and we’ll do it within 30 days” (Reddit) to “only if you pay us” (Asana).
  • Some companies only allow you to export your own data if you are a paying customer (Asana, as noted above).
  • Some SaaS companies let users create on-demand exports/snapshots of the data they have generated by using the service, with no human involvement. Others (at the time of writing: Reddit, Quora) require that you contact a human on their support team to request the data export; the team then delivers it via a support ticket (or directly through the platform) within a guaranteed timeframe. Others again let you generate some kind of export yourself, but it’s just a pile of unstructured JSON that is difficult to make sense of or do much with (Trello).
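To illustrate the “pile of JSON” problem, here is a minimal Python sketch that turns a board export into something readable. It assumes the export is a single JSON object with top-level "lists" and "cards" arrays, where each card carries "name", "idList", and "closed" fields; that layout is my assumption about the Trello export and may not match every export. The function name cards_by_list and the sample data are made up for illustration.

```python
def cards_by_list(board: dict) -> dict[str, list[str]]:
    """Group the names of open cards under the name of their list.

    Assumed (not guaranteed) export layout:
      board["lists"] -> [{"id": ..., "name": ...}, ...]
      board["cards"] -> [{"name": ..., "idList": ..., "closed": bool}, ...]
    """
    # Map each list id to its human-readable name.
    list_names = {lst["id"]: lst["name"] for lst in board.get("lists", [])}

    grouped: dict[str, list[str]] = {}
    for card in board.get("cards", []):
        if card.get("closed"):
            continue  # skip archived cards
        list_name = list_names.get(card.get("idList"), "(unknown list)")
        grouped.setdefault(list_name, []).append(card["name"])
    return grouped


# Hypothetical sample standing in for a real export file.
sample_board = {
    "lists": [{"id": "l1", "name": "To Do"}, {"id": "l2", "name": "Done"}],
    "cards": [
        {"name": "Write post", "idList": "l1", "closed": False},
        {"name": "Old idea", "idList": "l1", "closed": True},
        {"name": "Publish repo", "idList": "l2", "closed": False},
    ],
}

for list_name, card_names in cards_by_list(sample_board).items():
    print(f"{list_name}: {', '.join(card_names)}")
```

With a real export you would load the file with json.load and pass the resulting dictionary to the function instead of the sample.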

Additionally, what you get back when you request (or automatically download) a data export varies quite widely too. You might receive:

  • All the data that you have ever contributed to the platform (Twitter)
  • All the articles that you have contributed to the platform but none of the images that you have used, as those remain locked up in the provider’s CDN (Medium)
  • A single CSV file (Goodreads)
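For the case where the articles come back but the images stay behind on the provider’s CDN, one workaround is to list the image URLs that an exported article still points to, so they can be fetched separately. The sketch below assumes the export contains articles as HTML files with ordinary img tags, which I have not verified for every provider; the names ImageCollector and image_urls are illustrative only, and it uses nothing beyond Python’s standard library.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def image_urls(html_text: str) -> list[str]:
    """Return the image URLs referenced by one exported HTML article."""
    collector = ImageCollector()
    collector.feed(html_text)
    return collector.urls


# Hypothetical article fragment; cdn.example.com is a placeholder host.
sample_article = (
    '<article><p>Hello</p>'
    '<img src="https://cdn.example.com/a.png">'
    '<img alt="no source here">'
    '<img src="https://cdn.example.com/b.jpg" />'
    '</article>'
)

for url in image_urls(sample_article):
    print(url)
```

The resulting list could then be handed to whatever download tool you prefer, so that the text and the images end up in the same backup.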

Initiating the backup might be facilitated by a dedicated tool (Google Takeout / G Suite), or else you might need to know how to compress an archive yourself in order to pull out your files (cPanel).

So my conclusions so far are:

  • Sadly no, not all SaaS companies allow users to take a backup of their own data.
  • Approaches to user data export remain highly inconsistent among providers.
  • Providers that have improved their data export options in order to comply with the GDPR are allowing all users to avail of the functionality, whether they are based in the EU or not.
  • A cross-industry, coordinated, universal, standardized, and GDPR-compliant approach to user data export would probably be a good thing.

If you are also interested in data portability, exports, and backups, please feel free to follow my repository on GitHub: danielrosehilljlm/CloudBackupApproaches

