Application Consistent Snapshots

Disk Snapshots

Kiran K
Google Cloud - Community
4 min read · Nov 21, 2023

Snapshots let you back up your disk data at any point in time. They are used where long-term data backup or disaster recovery is required.

In GCP there are 3 types of Snapshots for Compute Engine Instances.

  1. Standard Snapshot
  2. Archive Snapshot
  3. Instant Snapshot

Standard and archive snapshots differ primarily in storage location and cost. Snapshots are incremental by default to avoid billing you for redundant data and to minimize use of storage space.

Archive snapshots have the same benefits as standard snapshots including incremental chains, compression, and encryption.

Need for Application Consistency

If you create a snapshot of your Persistent Disk while your application is running, the snapshot might not capture pending writes that are in transit from memory to disk. Because of these inconsistencies, the snapshot might not reflect the exact state of your application at the time you captured the snapshot. In this scenario, the snapshot is considered crash consistent because it captures the state of the application as if the machine crashed at the time the snapshot was taken.

Optionally, you can pause the application, so that all application transactions complete and the system can flush all pending writes from memory to disk before the snapshot is captured. In this scenario, the snapshot is considered application consistent.

Photo by Malcolm Lightbody on Unsplash

Consistent Snapshots

When you take a snapshot of a Persistent Disk, you don’t need to take any additional steps to make your snapshot crash consistent. In particular, you do not need to pause your workloads. Application consistent snapshots are available for both standard and archive snapshots.

The quality of your persistent disk snapshot depends on how well your applications can recover from snapshots that you create during heavy write workloads. Application consistent snapshots capture the state of application data at the time of backup with all application transactions completed and all pending writes flushed to the disk.

To create snapshots that are application consistent, pause apps or operating system processes that write data to the persistent disk, flush the disk buffers, and sync the file system before you create the snapshot. Depending on your application, these and other steps might be required to ensure that all application transactions are complete and captured in the backup.
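
The ordering described above (pause writers, flush, snapshot, resume) can be sketched in one place as follows. This is only an illustration: pause_app and resume_app are hypothetical hooks that you would replace with your application's own commands, and the echo at the end stands in for the actual snapshot call.

```shell
#!/bin/sh
# Hypothetical hooks: replace the bodies with your application's own commands.
pause_app()  { echo "pause"; }
resume_app() { echo "resume"; }

# with_quiesced_app COMMAND [ARGS...]
# Pauses the app, flushes buffers, runs COMMAND, then always resumes the app.
with_quiesced_app() {
  pause_app || return 1
  sync                 # flush pending file system writes from memory to disk
  "$@"
  status=$?
  resume_app
  return "$status"     # report the wrapped command's result to the caller
}

with_quiesced_app echo "snapshot would be taken here"
```

The wrapper resumes the application even when the wrapped command fails, so a failed snapshot does not leave the workload paused.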

To create an application consistent snapshot of your persistent disks, use the following process:

  1. To prepare the guest environment for application consistency, create custom shell scripts to run before and after the snapshot is captured.
  2. Configure snapshot settings on your VM (virtual machine) instance.
  3. Create a snapshot with the guest-flush option enabled. The guest-flush option starts your pre and post snapshot scripts.
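
For step 3, a one-off snapshot request with guest-flush might look like the sketch below. The snapshot name, disk name, and zone are placeholder values that are not from this article, so replace them with your own. The function only prints the gcloud command unless you set RUN=1, which makes it safe to try before touching a real disk.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration): replace with your own.
SNAPSHOT_NAME="app-consistent-snap-1"
SOURCE_DISK="example-disk"
ZONE="us-central1-a"

# Builds the snapshot command; prints it unless RUN=1 is set.
create_guest_flush_snapshot() {
  set -- gcloud compute snapshots create "$SNAPSHOT_NAME" \
    --source-disk "$SOURCE_DISK" \
    --source-disk-zone "$ZONE" \
    --guest-flush
  if [ "${RUN:-0}" = "1" ]; then
    "$@"          # actually send the request
  else
    echo "$*"     # dry run: show what would be executed
  fi
}

create_guest_flush_snapshot
```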

Prerequisite

Update the guest environment for your operating system. For example, on an Ubuntu VM, run the following commands to update the guest environment.

sudo apt update
sudo apt install google-compute-engine google-compute-engine-oslogin \
google-guest-agent google-osconfig-agent

Pre-Snapshot Script

  1. Navigate to /etc/google/snapshots/ directory on your VM and create a pre.sh file.
  2. Paste the following into the pre.sh file. Running fsfreeze -f blocks any running process that tries to access the file system, so use this with caution if your application is latency-sensitive.
#!/bin/bash
sudo fsfreeze -f [example-disk-location]

Post-Snapshot Script

  1. Navigate to /etc/google/snapshots/ directory on your VM and create a post.sh file.
  2. Paste the following into the post.sh file. Running fsfreeze -u unfreezes the file system so that the processes blocked by the pre-snapshot script can resume.
#!/bin/bash
sudo fsfreeze -u [example-disk-location]
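
If you prefer to create both scripts in one go, a small helper along these lines could do it. This is a sketch under a few assumptions: the function name is made up for illustration, the 755 permission is my assumption about what the guest agent needs in order to execute the files, and the mount point you pass in replaces [example-disk-location].

```shell
#!/bin/sh
# install_snapshot_scripts DIR MOUNT_POINT
# Writes pre.sh (freeze) and post.sh (unfreeze) into DIR and makes them executable.
install_snapshot_scripts() {
  dir=$1
  mount_point=$2
  if [ -z "$dir" ] || [ -z "$mount_point" ]; then
    echo "usage: install_snapshot_scripts DIR MOUNT_POINT" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  # Same commands as the pre and post scripts shown above.
  printf '#!/bin/bash\nsudo fsfreeze -f "%s"\n' "$mount_point" > "$dir/pre.sh" || return 1
  printf '#!/bin/bash\nsudo fsfreeze -u "%s"\n' "$mount_point" > "$dir/post.sh" || return 1
  # Assumption: the scripts must be executable for the guest agent to run them.
  chmod 755 "$dir/pre.sh" "$dir/post.sh"
}

# Example (run as root; /mnt/example-data is a hypothetical mount point):
#   install_snapshot_scripts /etc/google/snapshots /mnt/example-data
```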

Edit Guest Environment Configuration file

  1. Open or create your guest environment configuration file: /etc/default/instance_configs.cfg
  2. Add the following to the configuration file:
[Snapshots]
enabled = true
timeout_in_seconds = 60

enabled: true or false. Whether the application consistent snapshot feature is enabled.

timeout_in_seconds: Integer in the range [0, 300]; the default is 60. The number of seconds the pre or post snapshot script can take to finish running before a timeout error. Note that the entire snapshot operation can take at most 300 seconds per disk before a timeout error, and this limit is not configurable.

Save your configuration settings, then restart the guest agent so that the changes take effect:

sudo systemctl restart google-guest-agent.service
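
The configuration edit can also be scripted. The helper below is a rough sketch (the function name is hypothetical): it appends the [Snapshots] section only when the file does not already contain one, and it rejects timeout values outside the 0 to 300 range mentioned above.

```shell
#!/bin/sh
# enable_snapshots CONFIG_FILE [TIMEOUT_SECONDS]
# Appends a [Snapshots] section unless the file already has one.
enable_snapshots() {
  cfg=$1
  secs=${2:-60}
  # Accept only non-negative integers up to 300.
  case "$secs" in
    ''|*[!0-9]*) echo "timeout must be a non-negative integer" >&2; return 1 ;;
  esac
  if [ "$secs" -gt 300 ]; then
    echo "timeout must be at most 300" >&2
    return 1
  fi
  if [ -f "$cfg" ] && grep -q '^\[Snapshots\]' "$cfg"; then
    echo "[Snapshots] already present in $cfg; leaving it unchanged"
    return 0
  fi
  printf '\n[Snapshots]\nenabled = true\ntimeout_in_seconds = %s\n' "$secs" >> "$cfg"
}

# Example (run as root), followed by the guest agent restart shown above:
#   enable_snapshots /etc/default/instance_configs.cfg 60
```

Because the helper leaves an existing [Snapshots] section alone, changing the timeout later still means editing the file by hand.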

If you want to schedule application consistent snapshots for your backup, use the --guest-flush option when you create the snapshot schedule so that the pre and post snapshot scripts execute before and after each scheduled snapshot.

Schedule Snapshots

To create a snapshot schedule with guest-flush enabled, run the following:

gcloud compute resource-policies create snapshot-schedule SCHEDULE_NAME \
--description "MY HOURLY SNAPSHOT SCHEDULE" \
--region REGION \
--max-retention-days MAX_RETENTION_DAYS \
--start-time 22:00 \
--hourly-schedule 4 \
--guest-flush

Replace SCHEDULE_NAME with a name for the snapshot schedule, REGION with the region in which to create the schedule, and MAX_RETENTION_DAYS with the number of days to keep each snapshot. Adjust the start time and interval to suit your backup needs.
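
To see what a start time of 22:00 with an hourly schedule of 4 means in practice, the sketch below lists the start hours in one 24-hour cycle. It assumes the schedule repeats every N hours counted from the start time and that times are in UTC; the function name is made up for illustration.

```shell
#!/bin/sh
# print_run_times START_HOUR INTERVAL_HOURS
# Lists the start hours in one 24-hour cycle.
# Pass START_HOUR without a leading zero (8, not 08).
print_run_times() {
  start=$1
  every=$2
  [ "$every" -gt 0 ] || return 1   # guard against a zero interval
  i=0
  while [ $((i * every)) -lt 24 ]; do
    printf '%02d:00\n' $(( (start + i * every) % 24 ))
    i=$((i + 1))
  done
}

print_run_times 22 4
```

For the values used in the command above, this prints 22:00, 02:00, 06:00, 10:00, 14:00, and 18:00, so the pre and post scripts would run six times a day.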

Limitations

  • Application consistency is guaranteed only by the behaviour of your custom pre and post snapshot scripts, not by the snapshot operation itself.
  • When using the guest-flush option in your snapshot creation request, no snapshot is created in the event of a script error or timeout.
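
Since a script error or timeout means no snapshot, it can help to make slow steps fail fast and loudly inside your own scripts. The sketch below wraps a step with the coreutils timeout command; the function name and the command in the usage comment are hypothetical, and the limit you pick should stay below the timeout_in_seconds value you configured.

```shell
#!/bin/sh
# run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND, giving up after SECONDS; returns non-zero on failure or timeout.
run_with_deadline() {
  limit=$1
  shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "step failed or timed out (exit $status): $*" >&2
  fi
  return "$status"
}

# Example inside pre.sh (my_app_flush is a hypothetical application command):
#   run_with_deadline 30 my_app_flush || exit 1
```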

Happy Learning! 😊
