Considerations with K8s NetworkPolicy
Our Kubernetes journey has been a fun one, but it hasn’t been without its twists. One of the best practices when deploying applications into your Kubernetes cluster is to use a NetworkPolicy to ensure your pods can only communicate with the things they actually need to.
Well, it turns out that you can really mess things up when you don’t give these policies proper consideration, so allow us to explain some of the gotchas we and others have experienced.
Your pods are OPEN by default
The act of installing a network policy controller that actually enforces NetworkPolicy, like Calico or Cilium (plain Flannel does not enforce them), is a bit like fitting a lock to your door. If you do nothing with the lock, your door is still open.
Remedy: You can install a default deny-all rule in your namespace. This will isolate all pods, and they will only accept traffic once you create network policies that open them up. The instructions for how to do this can be found here.
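The linked instructions boil down to something like the following minimal sketch (the policy name is illustrative):

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny-all
spec:
  podSelector: {}   # an empty selector matches every pod in the namespace
  ingress: []       # no allowed sources, so all ingress is denied
```

Because the podSelector is empty, every pod in the namespace becomes isolated, and only traffic explicitly allowed by some other policy gets through.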
Your deny-egress Network Policy will literally deny all egress
That includes egress to really useful pods like your kube-dns pod. Without it, none of the service discovery functionality will work, because your pods cannot resolve DNS names. You can test this by using the kubectl exec command to get a shell in your pod and running an nslookup.
The following command should successfully resolve on all pods. If it can’t, something is very wrong:
nslookup kubernetes.default
If you get a response that contains something like this, you’ve got one of two problems.
nslookup kubernetes.default
Server: <NAMESERVER IP>
Address: <NAMESERVER IP AND PORT>
** server can't find kubernetes.default: NXDOMAIN
* Can't find kubernetes.default: No answer
- Your DNS server is up, but you can’t reach it.
- Your DNS server is down (stop reading articles and start looking at logs!).
Remedy: You can explicitly allow egress from your application on port 53 (UDP and TCP). Doing this is easy, and instructions can be found here. This will solve the first problem. For the second problem, you’re on your own.
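The allow rule the remedy describes might be sketched like this (the policy name and the run: nginx label are illustrative, standing in for your app’s own labels):

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-dns-egress
spec:
  podSelector:
    matchLabels:
      run: nginx        # assumption: your app pods carry this label
  policyTypes:
  - Egress
  egress:
  - ports:              # no "to" clause, so port 53 is open to any destination
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Allowing both UDP and TCP matters: DNS normally uses UDP, but falls back to TCP for large responses.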
You can’t apply Network Policies to the entire cluster by default
If we take our default-deny policy from before and inspect it, it looks like this:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  ingress: []
We might be fooled into thinking that this will apply across the board, but it won’t. A more accurate way to represent this YAML would be to include the default namespace:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}
  ingress: []
If you apply a single default-deny policy, it applies only to the namespace named in its metadata, or to the default namespace if you don’t name one.
Remedy: Calico offers a custom resource definition called a GlobalNetworkPolicy that allows you to apply networking rules to the entire cluster, regardless of namespace. The YAML to create this can be found here, and a detailed walkthrough can be found on the Calico website.
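As a sketch based on Calico’s v3 API (the policy name is illustrative; note that a blanket deny also applies to system namespaces such as kube-system, which Calico’s own docs recommend carving out with a selector):

```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: default-deny       # GlobalNetworkPolicy is cluster-scoped: no namespace field
spec:
  selector: all()          # matches every workload endpoint in the cluster
  types:
  - Ingress
  - Egress
```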
Network Policies shallow merge, not deep merge
Let’s say I declare a NetworkPolicy to allow ingress to my app:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: ingress-policy
spec:
  podSelector:
    matchLabels:
      run: nginx
  ingress:
  - from:
    - podSelector:
        matchLabels:
          ingress: "yes"
And I declare another policy that allows egress from my app:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: ingress-policy
spec:
  podSelector:
    matchLabels:
      run: nginx
  egress:
  - to:
    - podSelector:
        matchLabels:
          egress: "yes"
They will merge beautifully, since the two policies apply to the same matchLabels: the egress rules are added alongside the ingress rules.
If, however, you accidentally include your ingress field in your second policy, so that it looks something like this:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: ingress-policy
spec:
  podSelector:
    matchLabels:
      run: nginx
  egress:
  - to:
    - podSelector:
        matchLabels:
          egress: "yes"
  ingress: []
That will not merge, and you have just blown away your ingress rules. This is a bit of a silly mistake to make when you’re writing your YAML by hand, but when you’re templating with Helm charts, it’s a very easy mistake to make.
Remedy: Just pay some bloody close attention. Network policies will not perform a deep merge, only a shallow merge, so one set of rules gets blown away by the other.
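If you do want ingress and egress rules in a single object, the safer route is to spell out both sections, plus policyTypes, explicitly rather than relying on any merge. A sketch reusing the labels from the examples above:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: ingress-policy
spec:
  podSelector:
    matchLabels:
      run: nginx
  policyTypes:           # declaring both makes the intent explicit
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          ingress: "yes"
  egress:
  - to:
    - podSelector:
        matchLabels:
          egress: "yes"
```

With both sections present in one manifest, there is nothing left for a template merge to silently clobber.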