Application Consistent Snapshots
Disk Snapshots
Snapshots are a way to back up your disk data at a given point in time. They are used where long-term data backup or disaster recovery is required.
In GCP there are three types of snapshots for Compute Engine instances:
- Standard Snapshot
- Archive Snapshot
- Instant Snapshot
Standard and archive snapshots differ primarily in storage location and cost. Snapshots are incremental by default to avoid billing you for redundant data and to minimize use of storage space.
Archive snapshots have the same benefits as standard snapshots, including incremental chains, compression, and encryption.
Need for Application Consistency
If you create a snapshot of your Persistent Disk while your application is running, the snapshot might not capture pending writes that are in transit from memory to disk. Because of these inconsistencies, the snapshot might not reflect the exact state of your application at the time you captured the snapshot. In this scenario, the snapshot is considered crash consistent because it captures the state of the application as if the machine crashed at the time the snapshot was taken.
Optionally, you can pause the application, so that all application transactions complete and the system can flush all pending writes from memory to disk before the snapshot is captured. In this scenario, the snapshot is considered application consistent.
Consistent Snapshots
When you take a snapshot of a Persistent Disk, you don’t need to take any additional steps to make your snapshot crash consistent. In particular, you do not need to pause your workloads. Application consistent snapshots are available for both standard and archive snapshots.
The quality of your persistent disk snapshot depends on how well your applications can recover from snapshots that you create during heavy write workloads. Application consistent snapshots capture the state of application data at the time of backup with all application transactions completed and all pending writes flushed to the disk.
To create snapshots that are application consistent, pause apps or operating system processes that write data to the persistent disk, flush the disk buffers, and sync the file system before you create the snapshot. Depending on your application, these and other steps might be required to ensure that all application transactions are complete and captured in the backup.
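The flush-and-freeze sequence described above could be sketched as a small shell function, assuming the data lives on a single mount point. The function name quiesce_and_freeze and the FSFREEZE override variable are illustrative and not part of the guest environment. Pausing the application itself is left as a comment because it depends entirely on your workload.

```shell
#!/bin/bash
# Hypothetical helper: flush pending writes, then freeze one file system.
# FSFREEZE is an illustrative override so the command can be previewed
# (for example FSFREEZE=echo) without freezing anything.
quiesce_and_freeze() {
  local mount_point="$1"

  # Application-specific step (assumption): pause or lock your application
  # here so that in-flight transactions complete first.

  # Flush pending writes from memory to disk.
  sync || return 1

  # Freeze the file system until fsfreeze -u is run on the same mount point.
  "${FSFREEZE:-fsfreeze}" -f "$mount_point"
}
```

Calling quiesce_and_freeze with your data mount point from a pre-snapshot script would run the flush and the freeze in order and stop early if the flush fails.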
To create an application consistent snapshot of your persistent disks, use the following process:
- To prepare the guest environment for application consistency, create custom shell scripts to run before and after the snapshot is captured.
- Configure snapshot settings on your VM (virtual machine) instance.
- Create a snapshot with the guest-flush option enabled. The guest-flush option starts your pre and post snapshot scripts.
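As a rough illustration of the last step, a one-off snapshot with guest-flush enabled might be requested as in the sketch below. The function name create_consistent_snapshot, the GCLOUD override variable, and the example argument values are assumptions made for illustration. The gcloud flags shown are --source-disk, --source-disk-zone and --guest-flush.

```shell
#!/bin/bash
# GCLOUD is an illustrative override: set GCLOUD=echo to preview the
# command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Usage: create_consistent_snapshot SNAPSHOT_NAME DISK_NAME ZONE
create_consistent_snapshot() {
  local snapshot_name="$1" disk_name="$2" zone="$3"

  # --guest-flush asks the guest agent to run pre.sh before the snapshot
  # is captured and post.sh afterwards.
  "$GCLOUD" compute snapshots create "$snapshot_name" \
    --source-disk="$disk_name" \
    --source-disk-zone="$zone" \
    --guest-flush
}
```

For example, create_consistent_snapshot my-snapshot my-disk us-central1-a would request a snapshot of my-disk, where all three values are placeholders for your own resources.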
Prerequisite
Update the guest environment for your operating system. For example, if you are using an Ubuntu VM, run the following commands to update the guest environment.
sudo apt update
sudo apt install google-compute-engine google-compute-engine-oslogin \
google-guest-agent google-osconfig-agent
Pre-Snapshot Script
- Navigate to the /etc/google/snapshots/ directory on your VM and create a pre.sh file.
- Paste the following into the pre.sh file. Running fsfreeze -f blocks any running process that tries to access the file system, so use this with caution if your application is latency-sensitive.
#!/bin/bash
sudo fsfreeze -f [example-disk-location]
Post-Snapshot Script
- Navigate to the /etc/google/snapshots/ directory on your VM and create a post.sh file.
- Paste the following into the post.sh file. Running fsfreeze -u unfreezes the file system so that any processes blocked by the freeze can resume.
#!/bin/bash
sudo fsfreeze -u [example-disk-location]
Edit Guest Environment Configuration file
- Open or create your guest environment configuration file: /etc/default/instance_configs.cfg
- Add the following to the configuration file:
[Snapshots]
enabled = true
timeout_in_seconds = 60
- enabled: true or false. Whether the application consistent snapshot feature is enabled.
- timeout_in_seconds: Integer [0, 300]. Default is 60. Number of seconds the pre or post snapshot script can take to finish running before a timeout error. Note that the number of seconds the entire snapshot operation can take to complete before a timeout error is 300 seconds per disk, and this is not configurable.
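Because timeout_in_seconds only accepts integers from 0 to 300, it can be worth checking a value before writing it into the configuration file. The following is a minimal sketch of such a check. The function name valid_snapshot_timeout is hypothetical and not something the guest agent provides.

```shell
#!/bin/bash
# Hypothetical check: succeeds only when the argument is an integer
# in the documented range [0, 300].
valid_snapshot_timeout() {
  local value="$1"

  # Reject empty strings and anything containing a non-digit character.
  case "$value" in
    ''|*[!0-9]*) return 1 ;;
  esac

  # The value is a non-negative integer, so only the upper bound remains.
  [ "$value" -le 300 ]
}
```

The function returns success for values such as 0, 60, or 300 and failure for values such as 301, -5, or abc.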
Save your configuration settings, then restart the guest agent so that the changes take effect:
sudo systemctl restart google-guest-agent.service
If you want to schedule application consistent snapshots for your backup, use the --guest-flush option when you create the snapshot schedule so that the pre and post snapshot scripts execute before and after each scheduled snapshot.
Schedule Snapshots
To create a snapshot schedule with guest-flush enabled, run the following:
gcloud compute resource-policies create snapshot-schedule SCHEDULE_NAME \
--description "MY HOURLY SNAPSHOT SCHEDULE" \
--max-retention-days MAX_RETENTION_DAYS \
--start-time 22:00 \
--hourly-schedule 4 \
--guest-flush
Replace SCHEDULE_NAME with a name for the snapshot schedule and MAX_RETENTION_DAYS with the number of days to keep each snapshot (gcloud requires this flag). Adjust the start time and interval to match the schedule you need.
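A schedule only produces snapshots once it is attached to a disk. One way to attach it is sketched below using gcloud compute disks add-resource-policies. The function name attach_snapshot_schedule, the GCLOUD override variable, and the example values are assumptions for illustration.

```shell
#!/bin/bash
# GCLOUD is an illustrative override: set GCLOUD=echo to preview the
# command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Usage: attach_snapshot_schedule DISK_NAME SCHEDULE_NAME ZONE
attach_snapshot_schedule() {
  local disk_name="$1" schedule_name="$2" zone="$3"

  # Attach the snapshot schedule (a resource policy) to an existing disk.
  "$GCLOUD" compute disks add-resource-policies "$disk_name" \
    --resource-policies="$schedule_name" \
    --zone="$zone"
}
```

For example, attach_snapshot_schedule my-disk my-schedule us-central1-a would attach my-schedule to my-disk, assuming the schedule was created in the region that contains that zone.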
Limitations
- Application consistency is guaranteed only by the behaviour of your custom pre and post snapshot scripts, not by the snapshot operation itself.
- When using the guest-flush option in your snapshot creation request, no snapshot is created in the event of a script error or timeout.
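Since a script error or timeout means that no snapshot exists afterwards, a backup job may want to surface that failure instead of continuing silently. The sketch below assumes that gcloud exits with a non-zero status when the snapshot request fails. The function name snapshot_or_report and the GCLOUD override variable are hypothetical.

```shell
#!/bin/bash
# GCLOUD is an illustrative override, useful for trying the function
# without calling the real CLI.
GCLOUD="${GCLOUD:-gcloud}"

# Usage: snapshot_or_report SNAPSHOT_NAME DISK_NAME ZONE
snapshot_or_report() {
  local snapshot_name="$1" disk_name="$2" zone="$3"

  if "$GCLOUD" compute snapshots create "$snapshot_name" \
      --source-disk="$disk_name" \
      --source-disk-zone="$zone" \
      --guest-flush >/dev/null; then
    echo "snapshot ${snapshot_name} created"
  else
    # Assumption: a non-zero exit status means no snapshot was created.
    echo "snapshot ${snapshot_name} was not created; check pre.sh, post.sh and the timeout" >&2
    return 1
  fi
}
```

A cron job or CI step could call this function and alert on a non-zero return value so that a failed backup does not go unnoticed.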