Kubernetes Time Traveling — Chaos Engineering with Gremlin
This tutorial shows how to use the Gremlin Time Travel attack to change a host's system clock. The attack is cloud-agnostic and works across AWS, GCP, Azure, DigitalOcean, and more.
Here are a few reasons to use the Time Travel attack:
- Ensure your systems can effectively handle certificate expiration
- Prepare for unknown-unknown incidents caused by clock skew
- Prepare for unexpected downtime
Prerequisites
- A Gremlin account (sign up here)
- Your Gremlin daemon credentials
- A Kubernetes cluster
Time Travel a Kubernetes node using Gremlin
A Kubernetes cluster commonly consists of one control plane (primary) node and two or more worker nodes. The control plane schedules workloads onto the worker nodes, so when one worker node fails, its workloads can be rescheduled onto another.
Step 1 — Install the Gremlin Agent
The simplest way to install the Gremlin agent on your Kubernetes cluster is to use Helm. If you do not already have Helm installed, see the Helm documentation to get started. Once Helm is installed and configured, the next steps are to add the Gremlin repo and install the client.
Add the Gremlin Helm chart repository:
helm repo add gremlin https://helm.gremlin.com
Create a namespace for the Gremlin Kubernetes client:
kubectl create namespace gremlin
Next, run the helm command to install the Gremlin client. The command contains three placeholder variables that you need to replace with real values. Replace $GREMLIN_TEAM_ID with your Team ID and $GREMLIN_TEAM_SECRET with your Secret Key (both are part of your Gremlin daemon credentials), and replace $GREMLIN_CLUSTER_ID with a name for the cluster.
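One way to supply these values is to export them as shell variables before running the install command. The sketch below is an assumption about workflow rather than part of the Gremlin instructions: the values shown are placeholders, and require_vars is a hypothetical helper that stops early if any variable is empty.

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own credentials and cluster name.
export GREMLIN_TEAM_ID="your-team-id"
export GREMLIN_TEAM_SECRET="your-secret-key"
export GREMLIN_CLUSTER_ID="my-cluster"

# Hypothetical helper: reports the first variable that is unset or empty
# and returns non-zero, so a missing value is caught before helm runs.
require_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

require_vars GREMLIN_TEAM_ID GREMLIN_TEAM_SECRET GREMLIN_CLUSTER_ID
```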
If you are using Helm v3, run this command:
helm install gremlin gremlin/gremlin \
--namespace gremlin \
--set gremlin.secret.managed=true \
--set gremlin.secret.type=secret \
--set gremlin.secret.teamID=$GREMLIN_TEAM_ID \
--set gremlin.secret.clusterID=$GREMLIN_CLUSTER_ID \
--set gremlin.secret.teamSecret=$GREMLIN_TEAM_SECRET
For more information on the Gremlin Helm chart, including more configuration options, check out the chart on GitHub.
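After the install completes, it can help to confirm that the Gremlin pods are ready before moving on. This is a minimal sketch using standard kubectl commands; the wait_for_gremlin helper name and the 120-second default timeout are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: blocks until every pod in the gremlin namespace
# reports the Ready condition, or fails once the timeout is reached.
# Optional first argument overrides the timeout (default: 120s).
wait_for_gremlin() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace gremlin --timeout="${1:-120s}"
}

# Usage, once the helm install has finished:
#   wait_for_gremlin && kubectl get pods --namespace gremlin
```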
Step 2 — View the current clock time and disable NTP
Use the built-in Linux date tool to check the current system time:
date
You will see a result similar to the following:
Sat Mar 2 00:44:08 UTC 2019
Disable NTP on the instance:
sudo timedatectl set-ntp false
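To confirm the change took effect, you can query the NTP setting again. The sketch below assumes a systemd-based host where timedatectl show is available (systemd 239 or later); ntp_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if systemd reports NTP as enabled.
ntp_enabled() {
  [ "$(timedatectl show --property=NTP --value)" = "yes" ]
}

# Usage on the node, after running `sudo timedatectl set-ntp false`:
#   if ntp_enabled; then echo "NTP is still on"; else echo "NTP is off"; fi
```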
Step 3 — Creating a Time Travel Attack against a Kubernetes node using the Gremlin App
You can use the Gremlin App or the Gremlin API to trigger Gremlin Attacks and Scenarios. You can view the available range of Gremlin Attacks in Gremlin Help.
To create a Time Travel Scenario, click Scenarios in the left navigation bar, then click to create a new Scenario.
Host targeting should be selected by default. Click on the Exact button to expand the list of available hosts, and select one of them. You’ll see the Blast Radius for the attack is limited to 1 host.
Click “Choose a Gremlin,” and then select State and Time Travel. Leave the Length set to 60 seconds. Leave the radio button for NTP set to “No,” as we’ve already disabled NTP on the host. Leave the offset set to 86400 seconds. That’s the amount of clock drift that will be introduced. Then hit the green Unleash Gremlin button.
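An offset of 86400 seconds is exactly 24 hours, so the host clock should read one day ahead while the attack runs. The arithmetic can be sketched as follows, assuming GNU date (for the -d @epoch syntax); shifted_time is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: prints the UTC time that results from adding
# an offset (in seconds) to a Unix timestamp. Assumes GNU date.
shifted_time() {
  date -u -d "@$(( $1 + $2 ))" '+%Y-%m-%dT%H:%M:%SZ'
}

offset_seconds=86400
echo "offset in hours: $(( offset_seconds / 3600 ))"

# 1551487448 is the sample reading from Step 2 (Sat Mar 2 00:44:08 UTC 2019).
shifted_time 1551487448 "$offset_seconds"   # prints 2019-03-03T00:44:08Z
```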
Next, click to save your Scenario.
Now you can run your Kubernetes Time Travel Scenario.
When your Scenario is finished, you will be prompted to add your results to the Gremlin App.
Step 4 — Check the new adjusted clock time
Use the built-in Linux date tool to check the adjusted system time:
date
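To compare the two readings numerically, you can capture date +%s before and during the attack and check that the difference is close to the configured offset. This is a sketch under assumptions: clock_shifted_by is a hypothetical helper, and its tolerance argument absorbs the real time that passes between the two readings.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if (after - before) is within `tolerance`
# seconds of the expected offset. All arguments are integer seconds.
clock_shifted_by() {
  before="$1"; after="$2"; expected="$3"; tolerance="$4"
  diff=$(( after - before - expected ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  [ "$diff" -le "$tolerance" ]
}

# Usage on the node:
#   before=$(date +%s)      # run before starting the attack
#   after=$(date +%s)       # run while the attack is active
#   clock_shifted_by "$before" "$after" 86400 120 && echo "clock moved by about 24 hours"
```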
Conclusion
How does changing the clock time impact your Kubernetes cluster? Share your findings in the comments below!