Recovering from being locked out of a Compute Engine
Imagine you get “locked out” of your Linux Compute Engine because you made some kind of administration error, such as disabling SSH or blocking SSH through a firewall locally on the machine (e.g. iptables or ufw). How can you log in to investigate and perform repairs? Since we normally use SSH to log in, and our premise is that SSH is not available, we appear to be stuck.
One solution is to use the serial console. On a physical Linux machine, we would be able to log in at the console; this is where we would see our friendly “login:” prompt. Historically, Unix machines had arrays of terminals attached via serial cables, and for each terminal there would be a Unix process called getty running. It is this process that is responsible for spawning login to allow us to log in.
Since a Compute Engine is a physically remote virtual machine, there are obviously no attached terminals we can use; however, GCP provides a logical equivalent called the Serial Console. We can get to the Serial Console of a Compute Engine through the Cloud Console. By default, it shows us the logs generated during system boot but prevents us from interacting with it. Putting it another way, it is read-only. We can enable it for input, allowing us to log in. To achieve this, we set the serial-port-enable metadata key to the value 1. The default, even if the key is omitted, is 0, meaning interactive access is disabled.
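The same metadata can also be set from the command line. Here is a sketch; the instance name my-instance and zone us-central1-a are placeholders for your own values:

```shell
# Enable interactive access to the serial console by setting the
# serial-port-enable metadata key on the instance.
gcloud compute instances add-metadata my-instance \
    --zone us-central1-a \
    --metadata serial-port-enable=1
```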
Once enabled, we get to see a login prompt where we can enter a userid/password pair. But what credentials can we use? By default, there are no login credentials available to us; we would need to explicitly define a user and an associated password. However, we have a bootstrap problem: since we are locked out of our Compute Engine, we can’t log in to add the userid/password that would allow us to log in.
The solution is to create a startup script that adds a userid/password pair. If we define this script and then restart our Compute Engine, the startup script will run during the later part of the boot process and create our identity. Once the startup script has completed and the VM instance has started, we can then log in through the serial port.
Here is a walkthrough of these steps assuming, at the outset, that we are locked out. We also assume that the machine is otherwise able to boot correctly.
We created an Ubuntu Compute Engine instance and ran:
sudo iptables -A INPUT -p tcp --destination-port 22 -j DROP
which blocks port 22. This is a simulation of being “locked out”.
We enable serial port login by setting the Compute Engine instance metadata key serial-port-enable to 1. In the Cloud Console there is a checkbox which makes this simple:
Next we shutdown the Compute Engine instance.
Once shut down, we can define our startup script by adding custom metadata with the key startup-script and the value of:
#!/bin/bash
# Create a temporary user in the google-sudoers group and give it a known password.
useradd --groups google-sudoers tempuser
echo "tempuser:password" | chpasswd
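If working from the command line instead, the startup script can be attached while the instance is stopped. A sketch, assuming the script is saved locally as startup.sh and using the placeholder instance name my-instance in zone us-central1-a:

```shell
# Attach the startup script to the stopped instance under the
# startup-script metadata key, reading its value from a local file.
gcloud compute instances add-metadata my-instance \
    --zone us-central1-a \
    --metadata-from-file startup-script=startup.sh
```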
Now we restart the Compute Engine. This causes the startup script to execute, which adds our temporary user. When the Compute Engine has started, we will see a “Connect to serial console” button on the instance details page in the Console.
We have to press the enter key once in the resulting window to cause the login prompt to appear:
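As an alternative to the browser window, gcloud can open a serial console session from a local terminal. Instance name and zone below are placeholders:

```shell
# Open an interactive serial console session to the instance.
gcloud compute connect-to-serial-port my-instance \
    --zone us-central1-a
```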
We can now enter our userid and password (e.g. tempuser / password). We will be logged in as our temporary user. We can now switch to a root user shell using:
sudo su -
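In our simulated lock-out, the repair itself is simple: from the root shell, delete the firewall rule we added earlier. The rule specification mirrors the -A command, with -D (delete) in its place:

```shell
# Remove the DROP rule that was blocking inbound SSH on port 22.
iptables -D INPUT -p tcp --destination-port 22 -j DROP
```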
At this point, we are in the machine with all the authority we may need and can perform our investigation and repair. When we are happy that the environment has been fixed, we should shut down the VM instance, remove the startup script that creates our temporary userid, and uncheck the box that enables serial port login.
We should restart the VM one more time, log in with SSH, and explicitly remove our temporary user:
sudo userdel --remove tempuser
And this concludes the recipe. We illustrated the tweaking of our VM instance configuration using the Cloud Console administration interface, but we could also have performed the same tasks using the gcloud command-line interface.
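For example, the cleanup steps map to a single gcloud invocation; as before, my-instance and us-central1-a are placeholder names:

```shell
# Remove both the temporary startup script and the serial console
# enablement from the instance metadata.
gcloud compute instances remove-metadata my-instance \
    --zone us-central1-a \
    --keys startup-script,serial-port-enable
```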