A few tips and tricks about GCE startup scripts

In Google Cloud you can configure a VM with a startup script that runs each time the VM starts. Setting one up is easy, and plenty of blog posts describe how to do it, as does the public GCP documentation. A startup script gives you a universal way to define the commands that run on every boot. As with anything, there are a few caveats.

One time startup script

AWS lets you distinguish between EC2 startup and launch events, which makes it possible to run a script only the first time an EC2 instance boots. GCE does not provide this functionality directly. However, the workaround is simple: make the script's run conditional. The easiest way is to create an empty file at the end of the script and to skip execution when that file already exists:

if [[ -f /etc/startup_was_launched ]]; then exit 0; fi

touch /etc/startup_was_launched

Another way is to leverage guest attributes of the GCE instance metadata. This works similarly to the file, but the test has to send a REST API call to the metadata server. Guest attributes are less desirable because any process on the VM can read and modify them, while the file is created and owned by root.
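A minimal sketch of the guest-attribute variant, assuming guest attributes are enabled on the instance (the enable-guest-attributes metadata key set to TRUE) and that curl is available on the image. The namespace and key startup/done and the function names are hypothetical, chosen only for illustration:

```shell
# Sketch: run-once guard backed by a guest attribute instead of a file.
# "startup/done" is a hypothetical namespace/key, not a reserved name.
GA_URL="${GA_URL:-http://metadata.google.internal/computeMetadata/v1/instance/guest-attributes/startup/done}"

already_ran() {
  # -f makes curl fail when the server answers with an error status,
  # which is what happens while the attribute has not been set yet.
  curl -sf -H "Metadata-Flavor: Google" "$GA_URL" >/dev/null
}

mark_done() {
  # Stores the value "1" under the key; call this as the last step of the script.
  curl -sf -X PUT --data "1" -H "Metadata-Flavor: Google" "$GA_URL" >/dev/null
}
```

In a startup script this would be used as `already_ran && exit 0` at the top and `mark_done` at the very end.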

How to know that startup script finished running?

Sometimes when you write shell scripts or other automation that launches VMs, you need to know when the startup script has finished running. For example, you might need to perform some actions after the VM is launched and everything that had to be started on it is running. There is no managed solution for this at the moment. You can use Cloud Logging to capture the startup script output and look for lines that start with "startup-script":

INSTANCE_ID=$(gcloud compute instances describe $VM_NAME \
--zone=$VM_ZONE --format="value(id)")
gcloud logging read "resource.type=gce_instance AND \
resource.labels.instance_id=\"$INSTANCE_ID\" AND \
jsonPayload.message=~\"^startup-script:\""

However, these lines record individual events during execution and do not show when the execution actually finishes, so you cannot tell from them when all commands of the startup script have completed. Another way is to apply the previous advice and use a file or a guest attribute as a marker of the end of the startup script execution. For this to work, the startup script has to be aware of the marker and reset it at the beginning. The marker should also be chosen carefully so that it is not destroyed by GCP management activities (e.g. disabling guest attributes on VMs) or deleted by maintenance processes of the VM's OS. One more way, proposed by Nazia Mahimi, is to capture the serial port output by running:

gcloud compute instances get-serial-port-output $VM_NAME \
--zone=$VM_ZONE | grep \
"<instance_name> systemd: Startup finished"

This command can be launched periodically after the VM launch to check whether the startup script has finished executing. It also does not require any modification of the startup script to clean up and set end-of-script markers. There are a couple of considerations about this method. (1) It requires that serial port access to the GCE instance is allowed. If your GCP org enforces the constraints/compute.disableSerialPortAccess org policy constraint, the command will not work. (2) Given the security practice of minimizing the attack surface and the fact that VMs already provide SSH/RDP connectivity, there is little reason to open another connectivity option.
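For comparison, the end-of-script marker approach mentioned earlier could be sketched as below. This is only an illustration under assumptions: the marker path /etc/startup_finished and the wrapper function name are hypothetical, and the wrapper expects the real startup work to be passed in as a command.

```shell
# Sketch: reset a marker at the start of the startup script and write it at the end.
# MARKER is a hypothetical path; pick one that OS cleanup jobs will not touch.
MARKER="${MARKER:-/etc/startup_finished}"

run_startup_with_marker() {
  rm -f "$MARKER"                      # reset: a previous boot may have left it behind
  status=0
  "$@" || status=$?                    # the actual startup work, passed as a command
  printf '%s\n' "$status" > "$MARKER"  # written last, records the exit status
  return "$status"
}
```

External automation can then test for the marker file (for example over SSH) and read the recorded status from it.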

The recommended solution would be to use the already existing SSH/RDP connectivity to parse the execution logs of the startup script. The logs are captured into /var/log/syslog and can be queried by looking for a line with "startup-script exit status":

grep -m 1 "startup-script exit status" /var/log/syslog

The status is captured as 0 (success) or 1 (failure) and can be parsed from the captured line. Be sure to check the timestamp so that you know the entry is recent and you are not reading the log line from a previous restart.
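Parsing the status out of the captured line could look like the sketch below. It relies only on the "startup-script exit status" phrase quoted above; the function name is hypothetical and the rest of the log line format is not assumed.

```shell
# Sketch: extract the numeric status from a "startup-script exit status N" line.
parse_exit_status() {
  # Prints N; prints nothing and fails when the phrase is not in the line.
  printf '%s\n' "$1" \
    | sed -n 's/.*startup-script exit status \([0-9][0-9]*\).*/\1/p' \
    | grep .
}
```

A caller can pass the line returned by grep and compare the printed value with 0 to decide whether the startup script succeeded.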

To run this grep remotely you need to be able to run an SSH command on the VM:

gcloud compute ssh $VM_NAME --zone=$VM_ZONE --ssh-flag="-q" \
--command='grep -m 1 "startup-script exit status" /var/log/syslog' 2>&-

The 2>&- closes stderr so that you do not get gcloud error output when the VM has not started yet or the SSH port is not available at the time you execute the command.

I used the following bash script to probe for the startup script to finish on first launch. It provides a progress indication while waiting for the "right" log entry:
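A minimal sketch of such a probe loop, under stated assumptions; it is an illustration built from the gcloud compute ssh command above, not the author's exact script. VM_NAME and VM_ZONE are expected in the environment, while PROBE_INTERVAL, PROBE_ATTEMPTS and the function names are hypothetical.

```shell
# Sketch: poll the VM until the startup script's exit status line shows up.
PROBE_INTERVAL="${PROBE_INTERVAL:-5}"    # seconds between probes (hypothetical knob)
PROBE_ATTEMPTS="${PROBE_ATTEMPTS:-60}"   # give up after this many probes (hypothetical knob)

probe_once() {
  # Prints the first matching log line, or nothing if the VM or SSH is not ready yet.
  gcloud compute ssh "$VM_NAME" --zone="$VM_ZONE" --ssh-flag="-q" \
    --command='grep -m 1 "startup-script exit status" /var/log/syslog' 2>/dev/null
}

wait_for_startup_script() {
  attempt=0
  while [ "$attempt" -lt "$PROBE_ATTEMPTS" ]; do
    line="$(probe_once)" || true
    if [ -n "$line" ]; then
      printf '\n%s\n' "$line"            # the log line goes to stdout
      return 0
    fi
    printf '.' >&2                       # progress indication while waiting
    sleep "$PROBE_INTERVAL"
    attempt=$((attempt + 1))
  done
  printf '\ntimed out waiting for the startup script\n' >&2
  return 1
}
```

Calling `wait_for_startup_script` right after creating the VM prints dots until the log line appears or the attempts run out.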

The script will not work correctly across multiple VM restarts, because it always captures the first appearance of "startup-script exit status" and exits. Introducing a time threshold (5 minutes in the following example) and parsing the syslog from the bottom up resolves the problem (assuming that the VM is not restarted frequently):
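The two ideas, bottom-up parsing and a time threshold, could be sketched as follows. This is an illustration rather than the author's script, and it rests on assumptions: the log uses traditional syslog timestamps ("Mon DD HH:MM:SS" in the first 15 characters), GNU date and tac are available, and THRESHOLD_SECONDS and the function names are hypothetical.

```shell
# Sketch: take the most recent exit status line and accept it only if it is fresh.
THRESHOLD_SECONDS="${THRESHOLD_SECONDS:-300}"   # 5 minutes

last_exit_status_line() {
  # Reads the log bottom-up so that the most recent entry wins.
  tac "${1:-/var/log/syslog}" | grep -m 1 "startup-script exit status"
}

is_recent() {
  # $1 is a syslog line; succeeds when it is at most THRESHOLD_SECONDS old.
  stamp="$(printf '%s' "$1" | cut -c1-15)"
  logged="$(LC_ALL=C date -d "$stamp" +%s 2>/dev/null)" || return 1
  now="$(date +%s)"
  age=$((now - logged))
  # A negative age means the timestamp could not be placed in the past; reject it.
  [ "$age" -ge 0 ] && [ "$age" -le "$THRESHOLD_SECONDS" ]
}
```

On the VM, `line="$(last_exit_status_line)" && is_recent "$line"` succeeds only for an entry written within the threshold, so a line left over from an earlier boot is ignored.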

Use of environment variables in startup scripts

It is possible to inject custom values into startup scripts. However, this can become a hindrance when the startup script creates a shell script file that is supposed to use environment variables. The following command creates a VM that will have an /etc/sample.sh shell script:

gcloud compute instances create $VM_NAME --zone=$VM_ZONE \
--metadata startup-script='#!/bin/bash
cat > /etc/sample.sh <<EOF
#!/bin/bash
echo "my home directory is at $HOME"
EOF'

The expected content of the /etc/sample.sh is:

#!/bin/bash
echo "my home directory is at $HOME"

but running the above startup script results in an /etc/sample.sh that looks like:

#!/bin/bash
echo "my home directory is at "

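This happens because $HOME is substituted while the startup script is writing the file, before /etc/sample.sh is ever run: the heredoc delimiter EOF is unquoted, so the shell expands variables inside the heredoc, and HOME evidently has no value in the environment where the startup script runs. The solution is simply to escape the dollar sign, and then the shell script will look as expected: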

gcloud compute instances create $VM_NAME --zone=$VM_ZONE \
--metadata startup-script='#!/bin/bash
cat > /etc/sample.sh <<EOF
#!/bin/bash
echo "my home directory is at \$HOME"
EOF'
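The effect can be reproduced in any local shell. The sketch below uses a hypothetical variable DEMO_VAR that is empty in the writing shell, standing in for $HOME inside the startup script, and compares an unquoted heredoc, an escaped dollar sign, and a quoted delimiter:

```shell
# DEMO_VAR is a hypothetical stand-in for a variable that is empty
# in the shell that writes the file.
DEMO_VAR=""

# Unquoted delimiter: the variable is expanded while the heredoc is read.
unquoted="$(cat <<EOF
value is $DEMO_VAR
EOF
)"

# Escaped dollar sign: the reference is kept literally.
escaped="$(cat <<EOF
value is \$DEMO_VAR
EOF
)"

# Quoted delimiter: nothing inside the heredoc is expanded.
quoted="$(cat <<"EOF"
value is $DEMO_VAR
EOF
)"

printf '%s\n' "$unquoted" "$escaped" "$quoted"
```

The first line printed ends right after "value is ", while the other two keep the literal $DEMO_VAR. Quoting the delimiter is an alternative to escaping each dollar sign; inside the single-quoted --metadata argument, double quotes around EOF avoid clashing with the outer single quotes.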

Enjoy startup scripts 👍

A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

minherz

DevRel Engineer in Google Cloud, specializing in Observability and Reliability. I try to whisper to horses in free time.