Kinds of Data Protection

Jul 20, 2023


To ensure business continuity, you need to understand which data protection principles exist and how to apply them across different technologies.

High Availability

High Availability (HA) is the first notable principle: it tries to keep your data available all the time. An HA service or cluster continues to work even if one or several components fail, which means the Recovery Point Objective (RPO) is always 0 with HA and the Recovery Time Objective (RTO) is almost 0. Whatever the RTO number is, we assume that the service and the applications using it will survive the failure (perhaps with a short pause), continue to function, and not return an error to clients.

An essential part of any HA solution is automatic switchover between two or more components: applications transparently switch to the surviving components and keep interacting with them instead of the failed one. Your applications should have timeouts configured (for example, 180 seconds or less), and the RTO must be equal to or lower than those timeouts. HA solutions are designed to stay below the application timeout so that upstream services see a short pause rather than an error. Whenever RPO is not 0, the solution is by definition not HA.

The biggest problem with HA solutions is that they are limited by the distance over which their components communicate: the greater the distance, the more time is needed to keep all the data fully synchronous across the components so that any of them is ready to take over for the failed part.
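
The switchover behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical client that knows about several nodes and retries across them until the application timeout runs out; `NodeDown`, `read_with_failover`, `failed_node`, and `healthy_node` are illustrative names, not part of any real HA product.

```python
import time
from typing import Callable, Sequence


class NodeDown(Exception):
    """Raised by a (hypothetical) node that cannot serve the request."""


def read_with_failover(
    nodes: Sequence[Callable[[str], str]],
    key: str,
    timeout_s: float = 180.0,   # application timeout: the RTO must fit inside it
    retry_delay_s: float = 0.1,
) -> str:
    """Try each node in turn until one answers or the timeout budget runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        for node in nodes:
            try:
                # A surviving node answers: the caller sees a pause, not an error.
                return node(key)
            except NodeDown:
                continue  # this component failed, switch to the next one
        if time.monotonic() + retry_delay_s >= deadline:
            # The budget is exhausted: the caller now sees an error,
            # so this outage no longer looks like HA from its side.
            raise TimeoutError(f"no node answered within {timeout_s} s")
        time.sleep(retry_delay_s)


def failed_node(key: str) -> str:
    raise NodeDown("primary is down")


def healthy_node(key: str) -> str:
    return f"value-for-{key}"


if __name__ == "__main__":
    print(read_with_failover([failed_node, healthy_node], "order-42"))
```

In a real HA cluster the switchover usually happens inside the cluster rather than in client code, so treat this only as a picture of the timing rule: the caller gets an answer as long as recovery finishes before its timeout.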

Disaster Recovery

The second notable data protection technique is Disaster Recovery (DR). What is the difference between DR and HA? Both are types of data protection, right? By definition, DR starts with the assumption that you are already in a situation where your data is not available and your HA solution has failed for some reason (if you had one in the first place). Why does DR assume that your data is unavailable and your infrastructure service is disrupted? The answer is “by definition.” With DR, RPO may or may not be 0, but RTO is never 0, which means you will get an error when accessing your data and your service will be disrupted. By definition, DR assumes there is no fully automatic and transparent switchover.

Because HA and DR are both data protection techniques, people often confuse them and do not see the difference, or, vice versa, set them against each other and try to choose between them. Now that you know what they are and how they differ, you can probably guess that one cannot replace the other: they do not compete but complement each other.

Backup & Archive data protection

This is the most well-known technique. Backup is a lower level of data protection than DR: it lets you access your data on the backup site at any time in order to restore it, typically back to the production site. An essential requirement for backup is that the backed-up data is never altered. With backup, we therefore restore data to the original or another location but do not modify the backup copy itself, which means you do not run DR on your backup data. Typically, backups are stored locally in the form of local snapshots, and some of those snapshots are replicated to a remote location where they are better protected from a local malfunction. With archives, you do not have online access to your backups, or access is subject to a significant speed limit or other restrictions; you might need some time to bring archived data online before you can restore it to the source or another location.

Sync vs Async:

Imagine data being transferred asynchronously, from time to time, to a secondary site. This is obviously a DR technology, because you cannot switch to the DR site automatically: it does not hold the latest version of your data. That means you might need to prepare the data before you start your applications. For instance, you might need to apply DB logs to your database so that the “not the latest version” of the data becomes the latest one (if that is possible in the first place). Alternatively, you might need to choose which of the last few snapshots to restore, because the latest one might contain data corrupted by a virus or ransomware, for instance. Again, a DR scenario by definition assumes that you will not switch to the DR site instantly: it assumes you already have downtime, and that manual intervention, a script, or some modifications may be required before you can start and run your services, all of which takes time.
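
Choosing a snapshot to restore can be sketched as follows. This is only an illustration: the `Snapshot` class and its `is_clean` flag are hypothetical, and in practice the flag would come from whatever malware or consistency check you run against each snapshot.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    name: str
    taken_at: datetime
    is_clean: bool  # hypothetical result of a malware / consistency scan


def pick_restore_point(snapshots: Iterable[Snapshot]) -> Optional[Snapshot]:
    """Return the most recent snapshot that passed the scan, or None."""
    clean = [s for s in snapshots if s.is_clean]
    return max(clean, key=lambda s: s.taken_at, default=None)


if __name__ == "__main__":
    history = [
        Snapshot("hourly-01", datetime(2023, 7, 20, 10, 0), is_clean=True),
        Snapshot("hourly-02", datetime(2023, 7, 20, 11, 0), is_clean=True),
        Snapshot("hourly-03", datetime(2023, 7, 20, 12, 0), is_clean=False),
    ]
    print(pick_restore_point(history))
```

Here the newest snapshot is skipped because it is marked as corrupted, so the restore point is one hour older. That gap is exactly the non-zero RPO a DR scenario accepts.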

Synchronous replication can have two modes: Strict Full Synchronous mode and Relaxed Synchronous mode. The problem with Synchronous replication, as with HA solutions, is that the longer the distance between the two sites, the more time is needed to replicate the data. And the longer it takes for data to be transferred and confirmed back to the first system, the longer your application waits for its write confirmation.
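
To get a feel for the distance penalty, here is a rough back-of-the-envelope sketch. It assumes light travels through optical fiber at about 200,000 km/s (roughly two thirds of its speed in vacuum) and uses a deliberately simple model in which a write is confirmed only after the local write, one network round trip, and the remote write complete one after another. Real links add routing and equipment delays on top, so these numbers are lower bounds.

```python
FIBER_KM_PER_MS = 200.0  # assumption: ~200,000 km/s signal speed in fiber


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound for the network round-trip time over a fiber path."""
    return 2 * distance_km / FIBER_KM_PER_MS


def sync_write_latency_ms(
    local_write_ms: float,
    distance_km: float,
    remote_write_ms: float = 0.0,
) -> float:
    """Simplified model: local write, then round trip, then remote write."""
    return local_write_ms + min_rtt_ms(distance_km) + remote_write_ms


if __name__ == "__main__":
    for km in (10, 100, 700, 1000):
        print(f"{km:>5} km: RTT >= {min_rtt_ms(km):.1f} ms, "
              f"write >= {sync_write_latency_ms(0.5, km, 0.5):.1f} ms")
```

Under these assumptions, 700 km of fiber costs at least 7 ms per round trip, and every synchronous write pays that price before the application receives its confirmation.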

The relaxed mode tolerates lag and network outages and automatically resynchronizes once network communication is restored. That makes it a DR solution as well, because it allows RPO to be non-zero.

Strict mode, by definition, does not tolerate network outages, which ensures that RPO is always 0 and brings it closer to HA.
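
The difference between the two modes can be shown with a toy model. Everything here is a hypothetical sketch: `SyncReplicator` keeps both copies as in-memory lists and measures RPO as the number of records the replica is missing, which is not how a real storage system works but is enough to show the behaviour.

```python
from typing import List


class LinkDown(Exception):
    """Raised when strict mode cannot reach the replica."""


class SyncReplicator:
    def __init__(self, strict: bool) -> None:
        self.strict = strict
        self.link_up = True
        self.primary: List[str] = []
        self.replica: List[str] = []
        self.backlog: List[str] = []  # writes waiting for the link (relaxed mode)

    def write(self, record: str) -> None:
        if self.link_up:
            # Normal operation: both copies are updated before confirming.
            self.primary.append(record)
            self.replica.append(record)
        elif self.strict:
            # Strict mode: refuse the write rather than let the copies diverge.
            raise LinkDown("replica unreachable, write rejected")
        else:
            # Relaxed mode: accept locally and catch up later.
            self.primary.append(record)
            self.backlog.append(record)

    def restore_link(self) -> None:
        """Bring the link back and automatically resynchronize the backlog."""
        self.link_up = True
        self.replica.extend(self.backlog)
        self.backlog.clear()

    def rpo_records(self) -> int:
        """Records that would be lost if the primary died right now."""
        return len(self.primary) - len(self.replica)


if __name__ == "__main__":
    relaxed = SyncReplicator(strict=False)
    relaxed.write("a")
    relaxed.link_up = False
    relaxed.write("b")
    print("relaxed RPO during outage:", relaxed.rpo_records())
    relaxed.restore_link()
    print("relaxed RPO after resync:", relaxed.rpo_records())
```

The trade-off is visible in `write`: strict mode pays for RPO 0 with rejected writes during an outage, while relaxed mode stays writable at the cost of a temporarily non-zero RPO.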

Does it mean a Synchronous replica in Strict mode is an HA solution?

Well, not precisely. A Synchronous replica in Strict mode can also be part of a DR solution. For instance, suppose you have a DB whose data is replicated Asynchronously to a DR site while only the DB logs are replicated Synchronously. This reduces network traffic between the two locations and keeps the overall RPO small, and by applying the Synchronously replicated logs to the DB on the DR site you can bring the entire DB to RPO 0. In such a scenario RTO will not be too large, and the DR site can be located very far away from the primary one.
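
The log-based recovery in this example might look like the sketch below. The log format is an assumption made purely for illustration: each entry is a `(sequence_number, key, value)` tuple, and `applied_upto` is the last sequence number already contained in the Asynchronously replicated copy.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, int]  # hypothetical format: (sequence, key, value)


def recover_database(
    async_copy: Dict[str, int],
    applied_upto: int,
    sync_log: List[LogEntry],
) -> Dict[str, int]:
    """Replay synchronously replicated log entries on top of a stale copy."""
    db = dict(async_copy)  # never modify the replicated copy in place
    for seq, key, value in sorted(sync_log):
        if seq > applied_upto:  # skip changes the stale copy already contains
            db[key] = value
    return db


if __name__ == "__main__":
    stale_copy = {"balance:alice": 100, "balance:bob": 50}
    log = [
        (1, "balance:alice", 100),
        (2, "balance:bob", 50),
        (3, "balance:alice", 70),   # arrived after the last async transfer
        (4, "balance:carol", 30),   # arrived after the last async transfer
    ]
    print(recover_database(stale_copy, applied_upto=2, sync_log=log))
```

Replaying the log is the manual or scripted preparation step that makes this a DR procedure: the data ends up at RPO 0, but only after some recovery time has passed.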

To comply with the definition of HA, you need not only RPO 0 but also automatic switchover with an RTO no higher than the timeouts of your applications and services.
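
These three conditions can be written down as a small check. The function name and parameters below are illustrative, not taken from any tool; all times are in seconds.

```python
def is_ha(
    rpo_s: float,
    rto_s: float,
    app_timeout_s: float,
    automatic_switchover: bool,
) -> bool:
    """True only if the solution meets all three HA conditions from the text."""
    return rpo_s == 0 and automatic_switchover and rto_s <= app_timeout_s


if __name__ == "__main__":
    # Cluster that fails over in 30 s, applications time out after 180 s.
    print(is_ha(rpo_s=0, rto_s=30, app_timeout_s=180, automatic_switchover=True))
    # Strict synchronous replica, but switching sites is a manual procedure.
    print(is_ha(rpo_s=0, rto_s=900, app_timeout_s=180, automatic_switchover=False))
```

Anything that fails this check still counts as data protection, but it falls under DR or Backup rather than HA.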

How are HA, DR & Backup solutions applied in practice?

As you remember, HA, DR & Backup solutions do not compete but complement each other to provide full data protection. In a perfect world where money is no object and you need the highest possible, fully covered data protection and business continuity, you would need all of them: HA, DR, Backups, and Archive. HA is located in one place or Geo-distributed as far as possible (often up to 700 km), and on top of that you need DR and Backups. For Backups, you would probably want a site as far away as possible, for instance on the other side of the country or even on another continent. In these circumstances, you can replicate only some of your data Synchronously, such as DB logs, and the rest Asynchronously to an intermediate DR site (think of roughly up to 10 ms network RTT latency), and from that intermediate site replicate all the data Asynchronously, or as Backup protection, to another continent. Archiving can then be done from the DR and/or Backup sites.

Summary

HA, DR, Backup, and Archive are different types of data protection that complement each other. In the best case, a company should have not only an HA solution for its data but also DR, Backup, and Archive, or at least HA and Backup. It always depends on business needs, the level of protection the business is willing to get, and an understanding of the risks involved in not protecting the data properly.

See also:

Data Protection technologies comparison
Data in the cloud are not invincible
