Data in the cloud are not invincible

D
Jul 20, 2023

This article explains backup tools, data recovery principles, and how to protect your data and keep your services running.

Any cloud provider or cloud service can lose data, and there are many examples. The cloud takes care of many issues that customers of traditional on-premises applications have to handle themselves, but it cannot take care of everything.

Data protection and data consistency are two things that are typically not taken care of in full. It is essential to understand why, and what is missing. It is not because cloud providers do not want to, but because they cannot do it for the customer.

Complete Data Protection

Complete Data Protection consists of two equally important parts:

High Availability (HA), and Backup and Disaster Recovery.

High availability, in essence, is online synchronous replication between two or more nodes in your cloud or cluster. It can replicate your data between nodes in close proximity, between regions, states, or countries, and sometimes even between continents. So when one node goes down, you continue to work without even noticing it, and HA, if configured, does a great job. Unfortunately, as with any service, there can still be disruptions such as firmware or software bugs, human error, and all sorts of disasters, after which you can temporarily lose access to your data or even lose all of it. In addition, you might lose data if an employee intentionally or unintentionally destroys the company’s data, a developer accidentally modifies data, or a virus infects your computer and starts to erase your data. Oh yes, even cloud and SaaS are vulnerable to that.

Even if no ransomware targets a particular cloud provider at the moment, being a SaaS cloud service does not protect it from ransomware; sooner or later there will be one for each cloud and SaaS.

The strength of High Availability is that once you modify or add data in one place, you synchronously modify or add it in all the others. Hence, if one place fails, you continue to work, and your data is available in its most recent state. But this is also a great weakness: once you delete or corrupt your data, you delete or corrupt it everywhere. That’s where the second part of data protection, Backup, comes into play when HA cannot help.

Backup allows you to copy data to a few locations on a few types of media, so if your data is damaged, destroyed, modified, corrupted, or inaccessible, you can still lay your hands on it and restore your business. The only way to achieve that is to make sure you DO NOT synchronously mirror data the way HA does. Instead, you copy your data from time to time, so you have versions of your data instead of just the “current” version, and you can restore from any of them. This implies that if your HA dies and no longer provides access to your data for any reason, the data in your backup will not be the most recent but rather a version from the point in time when you last made a backup. In other words, if HA fails, you lose whatever changed since that backup, so you need to figure out the maximum time frame of newest data your business can afford to lose. Backup typically restores data to its original environment, but in some cases it can restore to a new location.
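
To make the difference between mirroring and versioned backup concrete, here is a minimal sketch. It is a toy model, not any provider’s mechanism: the dictionaries `primary` and `replica`, the `backups` list, and the sample records are all made up for illustration.

```python
import copy

# Toy model: two HA nodes kept in lockstep, plus a list of backup versions.
primary = {"alice": "alice@example.com", "bob": "bob@example.com"}
replica = {}
backups = []  # each entry is an independent point-in-time copy


def replicate():
    """Synchronous mirroring: the replica always equals the primary."""
    replica.clear()
    replica.update(primary)


def take_backup():
    """Periodic backup: store a deep copy that later changes cannot touch."""
    backups.append(copy.deepcopy(primary))


replicate()
take_backup()                       # version 0 holds alice and bob

primary["carol"] = "carol@example.com"
replicate()                         # written after the backup was taken

del primary["bob"]                  # accidental or malicious delete
replicate()                         # HA faithfully mirrors the mistake

print("bob" in replica)             # False: gone on every HA node
print("bob" in backups[-1])         # True: still in the last backup
print("carol" in backups[-1])       # False: newer than the last backup
```

The last two lines show both sides of the trade-off: the backup still has the deleted record, but it does not have anything written after it was taken.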

Disaster Recovery can use backed-up data to restore and run your services in a new environment or datacenter that was kept on standby, waiting for the main environment to fail. While the DR site is not in use, it is typically possible to reduce its resource consumption: you do not need as much CPU and memory, and mostly need storage to keep your DR copy. But once you experience a disaster and want to recover to the DR site, you will probably need as many resources as you had on the main site.

Responsibility for Backup & Data consistency. Expectation adjustment

Backup and data consistency are a burden on the customer’s shoulders. Cloud providers do not take it on because they simply cannot guarantee data protection in full. They do their best to protect customers with High Availability, and in most cases they do it extraordinarily well.

In your contract with a cloud provider, chances are they guarantee neither to restore your data at all nor to restore it within a time any business would consider adequate. That is simply how all clouds work. That’s why a Big Brother cloud provider can be down for, say, 4 hours without the customer being able to sue them.

Multi-Tenancy

When multiple clients run in a cloud, numerous customers can share a server node (invisibly to them), so their data is mixed together; it is time-consuming and hard to distinguish and separate one customer’s data from another’s after a disaster. Moreover, sometimes it is simply technically impossible for the cloud provider to restore your data in a multi-tenant cloud environment. That’s also why cloud providers typically don’t show backups of their infrastructure to the end customer: they might contain someone else’s data.

What is Data consistency? Some examples

Data corruption can be physical, when data is physically damaged by a disaster such as a flood or a power loss, or logical, when data is corrupted by ransomware, a firmware bug, or a user who destroyed some data.

Data corruption is the opposite of data consistency.

For example, someone can corrupt or delete your data intentionally (an angry employee leaving the company, or ransomware) or unintentionally (an employee or a developer). Think what would happen if someone deleted a field in a database table along with all its data, or accidentally removed all the @ symbols from your email field. Such data corruption is not the easiest to fix. Sometimes only a unique business process or a specialist within the organization can determine whether your data is consistent or corrupted, while no one else would notice. Sometimes data that one organization considers corrupted is perfectly fine for another. For example, some organizations can live with only a customer’s phone number, but others REQUIRE emails for their business process to continue. Therefore, the cloud cannot possibly know how to repair such logical data corruption.

Where to put @ in mmureyincorp.com?
mmurey@incorp.com OR mmureyin@corp.com
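
The ambiguity can be shown with a few lines of code. This is only an illustrative sketch: `candidate_addresses` is a hypothetical helper, and its rule for a plausible domain (it contains a dot and does not start with one) is a deliberately simple assumption.

```python
def candidate_addresses(damaged: str) -> list[str]:
    """Return every way to put a single '@' back into a damaged address."""
    candidates = []
    for i in range(1, len(damaged)):
        local, domain = damaged[:i], damaged[i:]
        # Keep only splits whose domain part still looks like "name.tld".
        if "." in domain and not domain.startswith("."):
            candidates.append(f"{local}@{domain}")
    return candidates


options = candidate_addresses("mmureyincorp.com")
print(len(options))                     # 11 syntactically plausible addresses
print("mmurey@incorp.com" in options)   # True
print("mmureyin@corp.com" in options)   # True
```

Every candidate is a well-formed address, so nothing in the data itself says which one is right. Only someone who knows the business, or a backup taken before the corruption, can answer that.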

Data Recovery

Data Recovery (or sometimes Data Restoration) is the process of restoring a customer’s entire organization in place. From time to time, cloud providers perform an internal backup so that, in case of a regional disaster, they can restore from it using Data Recovery. It is important to note that there is no magic: if your data was last backed up, for example, 3 hours ago and your region experiences a disaster, your data will be restored to the point of your last backup, not to its very latest state just before the failure. All the data generated and modified during that 3-hour window will be lost, so your business needs to understand and be prepared for such an unfavorable scenario.

RPO & RTO

In the example below, 4 hours is the Recovery Point Objective, or RPO: simply put, the time between your most recent backups, and therefore the most data you can lose. If you are going to use Data Recovery, you will be restoring the entire dataset (not individual records or tables). There is another thing you need to consider, called RTO. RTO is the Recovery Time Objective: simply put, the time needed to restore your last backup. Note that sometimes previous backups might be corrupted as well, for numerous reasons such as bugs in the lower levels of the cloud infrastructure that are not visible to you. But let’s imagine your last backup was successful and you restored your entire org. How long would it take? Well, I’m not going to sugarcoat it. Often it can take a long time: hours, or even days or weeks.

I am now imagining myself in a situation where my organization has not just lost its most recent 4 hours of data, but is also going to wait six more weeks after the disaster for the restore. And remember, there is no guarantee the cloud will restore your service with data in the first place. Plus, for some clouds this process can cost you $$ regardless of success. Can your organization afford such a long wait? That’s why you must take charge of your backups and do them regularly.
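
The arithmetic behind that scenario is simple enough to sketch. The timestamps below are invented for illustration; only the 4-hour gap and the six-week restore come from the example above, and `recovery_outlook` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def recovery_outlook(last_backup, disaster, restore_duration):
    """Return (window of lost data, moment the service is usable again)."""
    data_loss_window = disaster - last_backup          # what RPO has to cover
    service_restored_at = disaster + restore_duration  # what RTO has to cover
    return data_loss_window, service_restored_at


# Hypothetical timeline: backup at 08:00, disaster at 12:00 the same day.
last_backup = datetime(2023, 7, 20, 8, 0)
disaster = datetime(2023, 7, 20, 12, 0)

lost, back_online = recovery_outlook(last_backup, disaster, timedelta(weeks=6))
print(lost)         # 4:00:00
print(back_online)  # 2023-08-31 12:00:00
```

Plugging in your own backup interval and a realistic restore duration is a quick way to check whether the result is something your business could live with.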

The cost of data loss

Do you think this scenario is impossible, or too pessimistic? There is no magic; it’s technology, which means even Big Brother cloud providers can have downtime and lose data from time to time.

There have been multiple examples where Big Brother cloud providers lost 4 hours of customers’ data or more:

https://www.businessinsider.com/salesforce-lost-4-hours-of-customer-data-2016-5?op=1 is one of them, and it is not the only example. You should also know that the cloud might look like a single thing, but in reality it consists of regions, data centers, pods, racks, and servers, and a customer in one region is not going to be affected by a disaster in another region, and vice versa. When it happens, it is not an entire cloud breakdown; therefore, those events are not such hot topics in the news media, though you can still find them.

The cost of data loss consists of:

  • Labor cost
  • Recovery cost
  • Reputation
  • Revenue impact
  • Non-Compliance
  • Productivity

There are dramatic statistics showing that the majority of businesses go bankrupt if their data is not restored within a short period.

Is the cloud so bad?

If that thought crossed your mind after reading this, you got me wrong. The cloud is a very innovative, very reliable, Highly Available service, and almost everything I’ve told you in this article applies to all other similar services and to self-hosted data centers. But as I have stated a few times already, it’s technology, not magic, and all sorts of things can happen. The cloud technically cannot help with ALL of them.

There are tasks the client must perform themselves to ensure complete data protection.

Data protection IS the customer’s burden and MUST be done by the client themselves, because only they can possibly know whether their data is consistent, and it’s their data after all. The customer is responsible for their data protection, not the cloud. It’s like 2+2=4.

Why should customers take a look at Enterprise data protection backup services?

With enterprise backup systems, your organization can quickly schedule backups and perform self-managed restoration at the table, record, and other levels, restoring only the data or metadata you need through an intuitive graphical interface or an API. The result is simple, fast, and granular data recovery.

It is typically relatively easy to connect such a service to your organization and test-drive it.

But a backup solution can become more than that. Some customers, such as healthcare providers with strict internal and external compliance requirements, might need to store data on another platform, in another region, or even on their own premises, and might have to test their data against other compliance rules. Moreover, the versions of data and metadata captured in your backups can be used in the development process, so it becomes more than just a Backup and Recovery solution.
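
To illustrate what record-level (granular) restore means, here is a minimal sketch. It does not represent any vendor’s API: `restore_records`, the `snapshot` and `live` dictionaries, and the sample records are all assumptions made up for this example.

```python
import copy


def restore_records(live, snapshot, keys):
    """Copy only the named records from a backup snapshot into live data."""
    restored = []
    for key in keys:
        if key in snapshot:
            live[key] = copy.deepcopy(snapshot[key])
            restored.append(key)
    return restored


# Hypothetical backup version and the current (partly damaged) data.
snapshot = {
    1: {"name": "Ann", "email": "ann@example.com"},
    2: {"name": "Raj", "email": "raj@example.com"},
}
live = {
    1: {"name": "Ann", "email": "annexample.com"},   # corrupted after the backup
    2: {"name": "Raj", "email": "raj@example.com"},
    3: {"name": "Lee", "email": "lee@example.com"},  # created after the backup
}

restore_records(live, snapshot, [1])
print(live[1]["email"])  # ann@example.com
print(3 in live)         # True: newer data is left untouched
```

Compare this with restoring the entire dataset: a full restore of the snapshot would fix record 1 but would also throw away record 3, which was created after the backup.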

Summary

Companies should be aware that the backup part of data protection, and data consistency, are their responsibility, not the cloud’s.

By taking charge of them, they can prevent or reduce the damage from a disaster or from intentional or unintentional data corruption: recovery cost, and the impact on reputation, revenue, compliance, and productivity.

See also:

Kinds of Data Protection
Data Protection technologies comparison
