Enforce your policies as code on Kubernetes using Gatekeeper [Part 2]
This is the second part of my previous Gatekeeper blog. If you have not seen the first one yet, please take a look at the link below.
In this article, I want to share with you the rest of the Gatekeeper features that we did not have a chance to cover last time.
Basically, there are 2 things we have not discussed:
- Occasionally, we cannot make a decision based solely on the object being created or updated. A classic example: enforcing that Ingress hostnames are unique across the cluster. To achieve that, we need information about the existing Ingresses in the cluster. The Sync feature is the key to this scenario.
- Secondly, what if there are resources that were created before we applied our policies to the cluster? How do we detect them? This is where the Audit feature comes into the picture.
Sync
To use cluster information to aid our decision, Gatekeeper requires us to create a sync config (a Config resource) that tells Gatekeeper to replicate information from the cluster into its in-memory cache; that information is then accessible via data.inventory.
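Here is a minimal sketch of such a sync config, assuming Gatekeeper is installed in the gatekeeper-system namespace and that we only need Namespace objects for the example later in this post (your own config may list more kinds):

```yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
      # replicate Namespace objects into Gatekeeper's cache so policies
      # can read them via data.inventory
      - group: ""
        version: "v1"
        kind: "Namespace"
```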
Applying this config gives your policy access to the cluster's resource information (Namespaces, in our case), and we can use this data further in our decision-making process.
For example, say we want to create a policy enforcing that every resource created in a specific namespace must carry all of that namespace's labels.
1. Let's start by creating a namespace with an "author" label.
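For example (the namespace name and label value match what we use later in this post):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test-opa-2
  labels:
    author: insomnia-coder
```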
2. Let's draft our policy template; we will name it ResourceNamespaceLabelConstraint.
Looking at the body, you will see that all I did was intercept the incoming object so we can inspect it in the next steps.
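A minimal sketch of such a template could look like this (the debug-style message is only there to dump the incoming request; we will replace it with the real rule in step 5):

```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: resourcenamespacelabelconstraint
spec:
  crd:
    spec:
      names:
        kind: ResourceNamespaceLabelConstraint
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package resourcenamespacelabelconstraint

        violation[{"msg": msg}] {
          # for now, just surface the whole input so we can inspect
          # what Gatekeeper receives for this request
          msg := sprintf("inspecting input: %v", [input])
        }
```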
Then we apply it together with an actual constraint that applies to all core resources in our test namespace (test-opa-2).
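A constraint along these lines would do (the kind and name match the constraint we query later with kubectl):

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: ResourceNamespaceLabelConstraint
metadata:
  name: resourcenamespacelabelconstraint
spec:
  match:
    kinds:
      # "" is the core API group; "*" matches every kind in it
      - apiGroups: [""]
        kinds: ["*"]
    namespaces: ["test-opa-2"]
```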
3. Let's create a test service to capture the request object.
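Something like this will do (the service name is made up; the app: test label is the one we refer to later):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test-service        # hypothetical name
  namespace: test-opa-2
  labels:
    app: test
spec:
  selector:
    app: test
  ports:
    - port: 80
```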
An example of the input that Gatekeeper receives can be seen in this gist I created: https://gist.github.com/InsomniaCoder/58d42f678b1788121d75b3f30acf8a27
Basically, it consists of the constraint's parameters under the parameters object, and the object being created under the review object.
OK, the setup is done; it's time to write our desired policy.
4. We will get the labels of the object being created from the input, like this:
input.review.object.metadata.labels
and we will use the data we query from Gatekeeper's in-memory cache, which we synced earlier.
From Gatekeeper's official documentation, this is the format for accessing a namespace-scoped object:
data.inventory.namespace[<namespace>][groupVersion][<kind>][<name>]
- Example referencing a Pod in our test namespace:
data.inventory.namespace["test-opa-2"]["v1"]["Pod"]["test-xxxxxxx"]
For a cluster-scoped object, like our Namespace, the format looks like this:
data.inventory.cluster[<groupVersion>][<kind>][<name>]
In our case, we can access our namespace's labels with:
data.inventory.cluster["v1"].Namespace["<namespace>"].metadata.labels
5. We will compare the labels of the created object with the labels of its namespace. Let's look at the code to see how it's done.
- Disclaimer: I'm just a rookie in the OPA world; for a proper implementation you should look at the solid examples from the official library and community: https://github.com/open-policy-agent/gatekeeper-library/tree/master/src/general
- This code is just to give you an example of how to utilize Sync.
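With that in mind, here is a rough sketch of the finished template. It assumes the Namespace kind is synced via the Config shown earlier, and it simply flags any namespace label that is missing from the incoming object (including default labels, which a production policy would probably want to exclude):

```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: resourcenamespacelabelconstraint
spec:
  crd:
    spec:
      names:
        kind: ResourceNamespaceLabelConstraint
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package resourcenamespacelabelconstraint

        violation[{"msg": msg}] {
          # labels on the object being created or updated
          obj_labels := object.get(input.review.object.metadata, "labels", {})

          # labels on the object's namespace, read from the synced cache
          ns := input.review.object.metadata.namespace
          ns_labels := data.inventory.cluster["v1"].Namespace[ns].metadata.labels

          # flag any namespace label that is missing from the object
          some key
          ns_labels[key]
          not obj_labels[key]

          msg := sprintf("resource must carry namespace label %v=%v", [key, ns_labels[key]])
        }
```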
After we implement the rules with the code above, don't forget to re-apply the template.
Testing
Our Namespace has a single label (excluding the default ones), which is "author": "insomnia-coder",
and our service also has a single label, which is "app": "test".
If we attempt to create this service now, the request should be denied with a violation message from our policy.
Now, let's satisfy the policy by adding the author label to our service.
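For example, something like this (same hypothetical service as before):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test-service
  namespace: test-opa-2
  labels:
    app: test
    author: insomnia-coder   # now matches the namespace label
spec:
  selector:
    app: test
  ports:
    - port: 80
```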
Job done! We are good to go; the policy works as expected.
Audit
After implementing Sync with our policy, we are relieved of the burden of worrying that someone might commit the deadly act of forgetting to label their resources properly 😄
However, how do we ensure that all the resources that existed in the cluster before our superb policy took effect also comply with it?
Gatekeeper comes with audit functionality that keeps checking cluster resources against the policies we have applied. The results of this audit are shown in the status field of the constraint.
By default, Gatekeeper audits all resources in the cluster and requests them from the Kubernetes API server during each audit cycle (every 60 seconds by default). In a large cluster, this behavior can silently have a devastating effect by bombarding the API server until it can no longer serve other requests.
If you are interested, I highly recommend checking out the thorough walkthrough of the problem in this blog.
So, to prevent that, we can change Gatekeeper's behavior to only validate resources that are already in the OPA cache by setting the corresponding Helm value to true.
In addition, if we also set auditMatchKindOnly to true, the audit will only validate the resource kinds we specified in our constraints, making it even more efficient. Both settings are sketched below.
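In Helm-values terms, that boils down to something like this (the option names are the Gatekeeper chart's documented values; double-check them against the chart version you use):

```yaml
auditFromCache: true       # audit only resources already synced into the OPA cache
auditMatchKindOnly: true   # audit only the kinds referenced by constraints
# auditInterval: 60        # audit cycle length in seconds (60 is the default)
```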
Enough of the lessons, let's get to a real example.
Assume that we have created a service without the author label before we applied the constraint template.
After we have applied the constraint, we can check our Constraint resource easily by running
kubectl get resourcenamespacelabelconstraint resourcenamespacelabelconstraint -o yaml
You should see a result like this (abridged):
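Something along these lines, where the timestamp, names, and messages are placeholders but the shape follows Gatekeeper's audit status:

```yaml
status:
  auditTimestamp: "2021-01-01T00:00:00Z"   # illustrative value
  totalViolations: 2
  violations:
    - enforcementAction: deny
      kind: Service
      message: resource must carry namespace label author=insomnia-coder
      name: test-service
      namespace: test-opa-2
    - enforcementAction: deny
      kind: Endpoints
      message: resource must carry namespace label author=insomnia-coder
      name: test-service
      namespace: test-opa-2
```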
Observe that there are two violations: one from the Service we created before the policy, and one from the Endpoints object created for that Service.
I have not found any built-in follow-up action to take from here, but in my opinion we could add something on top of this, or build a custom operator, to send alerts for these violations if needed.
Hope you enjoyed the content. I think we are all set to start writing some policies for our clusters now. Until next time, cheers!