Advanced Gatekeeper policies — rejecting a node assignment
It has been some time since I wrote the two previous episodes of my Gatekeeper blog series. If you have not read them yet, you can check them out.
Since then, I have not had many chances to work with Gatekeeper in my day-to-day work.
However, I recently crossed paths with it again and got to write a couple of policies that I find interesting, so I want to share what I learned in this blog post.
What policy did I write?
I have a specific use case where I need to prevent pods that match certain criteria from being scheduled onto a node that is in a specific condition.
For example, if a pod with the annotation
annotations:
  app: a
gets scheduled onto a node where the condition ConditionA is False
apiVersion: v1
kind: Node
metadata:
  name: node-a
status:
  conditions:
  - type: "ConditionA"
    status: "False"
It should be denied, as simple as that.
Please note that this is a specific use case and it’s very likely that you don’t need to do this in your own setup; things like pod/node affinity should cover the normal cases. I will not go into details about the condition itself, since I want to share what I discovered I could do with Gatekeeper, not the use case specifically.
At first, I tried to write a policy that validated incoming pods and checked the condition there.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: PodConstraints
metadata:
  name: reject-pods-on-condition
spec:
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
However, I quickly realized that this could not achieve what I wanted. Why? Because of how Kubernetes works… Let me explain.
Basically, we are asking Gatekeeper to intercept requests to the API server that concern pods.
violation[{"msg": msg}] {
  # read the "app" annotation from the incoming pod
  appAnnotation := input.review.object.metadata.annotations["app"]
  appAnnotation == "a"
  msg := sprintf("pod has annotation app=%v", [appAnnotation])
}
Inside input.review.object is the spec of the pod being sent to the API server, and if you look at the pod’s creation event, you will see that there is no information about the node the pod will run on.
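For illustration, the pod object inside the admission request looks roughly like this at creation time (a trimmed, hypothetical example); notice that spec.nodeName is not set yet:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    app: a
spec:
  containers:
  - name: app
    image: nginx
  # no spec.nodeName here yet - the scheduler has not picked a node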
This is because the node assignment is determined later in the Kubernetes scheduling process by the kube-scheduler.
So, it is not possible to perform this validation using the pod creation event as the source for the policy. Then how can we continue? First, we have to understand how node assignment works in Kubernetes.
Node Assignment
This is how a pod is assigned to a node in Kubernetes:
- Pod Creation Request: The process starts when a user or a controller (e.g., Deployment, StatefulSet) creates a Pod object or updates an existing one. This request is sent to the Kubernetes API server.
- API Server Validation: The API server performs basic validation on the request to ensure it conforms to the schema and access control policies. If the request is valid, it is accepted.
- Admission Controller: After the request is accepted, it is passed to a series of admission controllers, which are responsible for enforcing policies and making decisions about whether to allow or reject the request. Our Gatekeeper policies are enforced at this stage.
- Scheduling Decision: Once the request is admitted by the admission controllers, including Gatekeeper, it proceeds to the scheduling phase. The Kubernetes scheduler, kube-scheduler, is responsible for making scheduling decisions.
- Node Selection: The scheduler evaluates each Pod and selects an appropriate node for it based on various factors. These factors include resource requirements, node affinity/anti-affinity rules, node taints and tolerations, and other scheduling policies. It considers the current state of the cluster and node availability to make an optimal decision.
- Binding: When the scheduler determines which node should run the Pod, it creates a Binding object. The Binding object specifies the binding of the Pod to the selected node. It contains the name of the Pod, the name of the node, and other relevant information.
- Node Assignment: The Binding object is sent to the Kubernetes API server, which updates the Pod’s internal representation to include the node assignment information. This informs the kubelet running on the selected node about the Pod it needs to run.
So, in short, kube-scheduler determines the best node to run your pod and creates an object called Binding that records which node your pod will run on.
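For reference, the Binding object that the scheduler sends to the API server looks roughly like this (a trimmed sketch; the names are illustrative):
apiVersion: v1
kind: Binding
metadata:
  name: my-pod        # same name as the pod being scheduled
  namespace: default
target:
  apiVersion: v1
  kind: Node
  name: node-a        # the node the scheduler picked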
Now that we understand how it works, the key is not to capture the pod event for validation, but to capture the Binding creation event instead.
Validate a Binding object
Let’s change our Gatekeeper constraint to match Binding instead of Pod.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RejectConditionAFalse
metadata:
  name: condition-a-false
spec:
  enforcementAction: deny
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Binding"]
In our template, we can now check for the CREATE event of the Binding object, verify that the bound pod has the annotation we are looking for and that the assigned node is in the condition we anticipated, and reject the binding if so. As simple as that.
violation[{"msg": msg}] {
  input.review.kind.kind == "Binding"
  input.review.operation == "CREATE"
  podName := input.review.object.metadata.name
  nodeName := input.review.object.target.name
  data.inventory.namespace[ns]["v1"]["Pod"][podName].metadata.annotations["app"] == "a"
  node := data.inventory.cluster.v1.Node[nodeName]
  # implement a function to validate your condition
  conditionAFalse(node.status.conditions)
  msg := sprintf("Pod '%v' cannot be scheduled on node '%v' as condition A is False", [podName, nodeName])
}
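The conditionAFalse helper is left for you to implement for your own condition; a minimal sketch could simply check whether any condition of type ConditionA reports the status False:
conditionAFalse(conditions) {
  # succeeds if at least one condition entry is ConditionA with status "False"
  some i
  conditions[i].type == "ConditionA"
  conditions[i].status == "False"
}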
Note that we also need access to two inventories, Pod and Node, because the Binding object created by kube-scheduler does not carry all the information about your pod, and no information at all about the node being assigned.
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper"
spec:
  sync:
    syncOnly:
    - group: ""
      kind: Pod
      version: v1
    - group: ""
      kind: Node
      version: v1
Inventory is a Gatekeeper feature that syncs information from inside the cluster so it can be used within your policies.
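Once synced, those objects are exposed to Rego under data.inventory, roughly in the following shapes (per the Gatekeeper documentation; the concrete names here are just examples):
# cluster-scoped objects: data.inventory.cluster[<groupVersion>][<kind>][<name>]
node := data.inventory.cluster["v1"]["Node"]["node-a"]

# namespace-scoped objects: data.inventory.namespace[<namespace>][<groupVersion>][<kind>][<name>]
pod := data.inventory.namespace["default"]["v1"]["Pod"]["my-pod"]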
With this, we can already achieve what we wanted, but some of you might wonder:
What happens to the pod whose assignment was rejected?
That’s a good question. The answer is that, after the rejection, the pod will initially stay in the Pending state. However, kube-scheduler will keep retrying to schedule the pod onto other nodes that can eventually run it.
Now that we can achieve what we wanted, I want to touch on another topic that I did not cover in the previous posts:
How to effectively test your policy
Gatekeeper provides a CLI tool called Gator, which you can use to create test suites for your policies and integrate them into your favorite CI.
gator test/verify allows us to test a set of Kubernetes objects against a set of Templates and Constraints. The command returns violations when found and communicates success or failure via its exit status.
In this example, we can create a suite for our use case, targeting the template and constraint we created earlier.
kind: Suite
apiVersion: test.gatekeeper.sh/v1alpha1
metadata:
  name: condition-a-false
tests:
- name: should-reject-pod-on-node-a-condition
  template: template.yaml
  constraint: constraint.yaml
  cases:
  - name: allowed
    object: samples/pod-does-not-have-annotation.yaml
    inventory:
    - samples/node-not-ready.yaml
    - samples/node-ready.yaml
    - samples/pod-no-annotation.yaml
    assertions:
    - violations: no
  - name: disallowed
    object: samples/pod-has-annotation.yaml
    inventory:
    - samples/node-not-ready.yaml
    - samples/node-ready.yaml
    - samples/pod-has-annotation.yaml
    assertions:
    - violations: yes
Inventory is a set of objects that are assumed to already exist inside the cluster; in this case, we use it to mock the node and pod information.
Object is the Kubernetes object whose admission we want to simulate in order to trigger the policy, which in our case is the Binding object.
And lastly, the assertions are where we specify the expected outcome.
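For illustration, samples/node-not-ready.yaml could simply be a Node manifest like the one from the beginning of the post, with ConditionA reporting False (a sketch; the exact content depends on the condition you are validating):
apiVersion: v1
kind: Node
metadata:
  name: node-not-ready
status:
  conditions:
  - type: "ConditionA"
    status: "False"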
In principle, this should simply work. However, in our case there is a minor detail that we need to handle. We want to simulate a CREATE event for a Binding object, and we cannot simply put the Binding object into the object field, as it will not be treated as a CREATE action.
But don’t worry, Gator allows us to work around this by supplying an AdmissionReview as the object. This can be helpful for simulating a certain operation (CREATE, UPDATE, DELETE, etc.) or UserInfo metadata.
So, this is our AdmissionReview for the Binding creation:
apiVersion: admission.k8s.io/v1
kind: AdmissionReview
request:
  kind:
    group: ""
    version: "v1"
    kind: "Binding"
  resource:
    group: ""
    version: "v1"
    resource: "bindings"
  subResource: ""
  requestKind:
    group: ""
    version: "v1"
    kind: "Binding"
  requestResource:
    group: ""
    version: "v1"
    resource: "bindings"
  requestSubResource: ""
  name: test-pod
  namespace: foo-namespace
  operation: "CREATE"
  object: # This is where you put the actual Binding object
    apiVersion: "v1"
    kind: "Binding"
    metadata:
      name: test-pod
      namespace: foo-namespace
    target:
      apiVersion: v1
      kind: Node
      name: node-not-ready
  oldObject: null
And that’s it, now we have a working test case for our policy that we can run with:
gator verify /gator_files/...
And that’s it for this blog post. I hope you can learn a thing or two from it, and just to emphasize again: the intention of this post is to show what Gatekeeper is capable of, so don’t focus too much on the example use case itself 😉
Thank you for reading!