Cilium and Security Groups for Pods in EKS

Amit Gupta
12 min read · Aug 2, 2023

☸️ Introduction

Security groups for Pods integrate Amazon EC2 security groups with Kubernetes Pods. You can use EC2 security groups to define rules that allow inbound and outbound network traffic to and from the Pods you deploy to nodes running on many Amazon EC2 instance types.

Security groups for pods make it easy to achieve network security compliance by running applications with varying network security requirements on shared compute resources.

🎯Goals & Objectives

Containerized applications frequently require access to other services running within the cluster as well as to external AWS services. In this article, you will learn how Cilium can be used alongside security groups for Pods on supported EKS clusters when running in chaining mode, and how AWS reuses classical networking terms like "trunk" and "branch" interfaces. 🙂

⛓What is CNI chaining in context with Cilium?

A number of posts have already covered CNI chaining in the context of Cilium, so we will restrict ourselves to just two points here.

  • CNI chaining allows you to use Cilium in combination with other CNI plugins.
  • With Cilium CNI chaining, base network connectivity and IP address management are handled by the non-Cilium CNI plugin, while Cilium attaches eBPF programs to the network devices created by the non-Cilium plugin to provide L3/L4 network visibility, policy enforcement, and other advanced features (see the install sketch below).
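  • For reference, here is a minimal sketch of installing Cilium in chaining mode on top of the AWS VPC CNI plugin via Helm. The chart version and values are assumptions based on the Cilium documentation for aws-cni chaining at the time of writing; adjust them to your environment.

# Minimal sketch (assumed chart version and values): install Cilium in CNI
# chaining mode on EKS, leaving IPAM and base connectivity to the AWS VPC CNI.
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.13.4 \
--namespace kube-system \
--set cni.chainingMode=aws-cni \
--set cni.exclusive=false \
--set enableIPv4Masquerade=false \
--set tunnel=disabled \
--set endpointRoutes.enabled=true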

Pre-Requisites

  • Verify the version of the Amazon VPC CNI plugin running in the cluster:
kubectl -n kube-system get ds/aws-node -o json | jq -r '.spec.template.spec.containers[0].image'

602401143452.dkr.ecr.ap-southeast-2.amazonaws.com/amazon-k8s-cni:v1.12.2-eksbuild.1
  • The trunk network interface is included in the maximum number of network interfaces supported by the instance type. For a list of the maximum number of network interfaces supported by each instance type, see IP addresses per network interface per instance type in the Amazon EC2 User Guide for Linux Instances. If your node already has the maximum number of standard network interfaces attached to it then the VPC resource controller will reserve a space. You will have to scale down your running Pods enough for the controller to detach and delete a standard network interface, create the trunk network interface, and attach it to the instance.
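  • You can check the per-instance-type interface limits directly from the EC2 API. The instance type below is only a placeholder; substitute the type your nodes run on.

# Example (m5.large is a placeholder): maximum number of ENIs and IPv4
# addresses per ENI for a given instance type.
aws ec2 describe-instance-types \
--instance-types m5.large \
--query 'InstanceTypes[].NetworkInfo.[MaximumNetworkInterfaces,Ipv4AddressesPerInterface]' \
--output table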

Demo

Deploy Security Groups for Pods

  • Ensure the AmazonEKSVPCResourceController managed policy is attached to the IAM role associated with the EKS cluster:
export EKS_CLUSTER_NAME="cluster1" # Change accordingly

export EKS_CLUSTER_ROLE_NAME=$(aws eks describe-cluster \
--name "${EKS_CLUSTER_NAME}" \
| jq -r '.cluster.roleArn' | awk -F/ '{print $NF}')

aws iam attach-role-policy \
--policy-arn arn:aws:iam::aws:policy/AmazonEKSVPCResourceController \
--role-name "${EKS_CLUSTER_ROLE_NAME}"
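  • As a quick sanity check (not part of the original walkthrough), you can confirm the policy is now attached to the cluster role:

aws iam list-attached-role-policies \
--role-name "${EKS_CLUSTER_ROLE_NAME}" \
--query 'AttachedPolicies[].PolicyName'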
  • Verify that the version of the AWS VPC CNI plugin running in the cluster is up to date:
kubectl -n kube-system get ds/aws-node \
-o jsonpath='{.spec.template.spec.containers[0].image}'
602401143452.dkr.ecr.ap-southeast-2.amazonaws.com/amazon-k8s-cni:v1.12.2-eksbuild.1
  • Patch the kube-system/aws-node DaemonSet to enable security groups for Pods.
    Note- If you are using liveness or readiness probes, you also need to disable TCP early demux so that the kubelet can connect to Pods on branch network interfaces over TCP. The patch below sets ENABLE_POD_ENI and DISABLE_TCP_EARLY_DEMUX in a single command.
kubectl -n kube-system patch ds aws-node \
-p '{"spec":{"template":{"spec":{"initContainers":[{"env":[{"name":"DISABLE_TCP_EARLY_DEMUX","value":"true"}],"name":"aws-vpc-cni-init"}],"containers":[{"env":[{"name":"ENABLE_POD_ENI","value":"true"}],"name":"aws-node"}]}}}}'

kubectl -n kube-system rollout status ds aws-node
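  • Optionally, verify that the ENABLE_POD_ENI environment variable took effect after the rollout (this check is an addition to the original steps):

kubectl -n kube-system get ds aws-node \
-o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="ENABLE_POD_ENI")].value}'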
  • After the rollout is complete, all nodes in the cluster should have the vpc.amazonaws.com/has-trunk-attached label set to true:
    Note- Once the trunk network interface is created, Pods are assigned secondary IP addresses from the trunk or standard network interfaces. The trunk interface is automatically deleted if the node is deleted.
kubectl get nodes -L vpc.amazonaws.com/has-trunk-attached

NAME                                                 STATUS   ROLES    AGE     VERSION               HAS-TRUNK-ATTACHED
ip-192-168-127-2.ap-southeast-2.compute.internal     Ready    <none>   4h43m   v1.25.9-eks-0a21954   true
ip-192-168-152-227.ap-southeast-2.compute.internal   Ready    <none>   4h43m   v1.25.9-eks-0a21954   true

Deploy an Application

  • To use security groups for Pods, you must have an existing security group and deploy an Amazon EKS SecurityGroupPolicy to your cluster, as described in the following procedure. The steps below show how to apply the security group policy to a Pod.
    Note-
    -The security group must allow inbound communication from the security group applied to your nodes (for kubelet) over any ports that you have configured probes for.
    -The security group must allow outbound communication over TCP and UDP port 53 to a security group assigned to the Pods (or to the nodes that the Pods run on) running CoreDNS.
    -The security group for your CoreDNS Pods must allow inbound TCP and UDP port 53 traffic from the security group that you specify.
    -The security group must have the necessary inbound and outbound rules to communicate with any other Pods it needs to reach.
  • Security group for the sample application that we will configure below.
    Note- This is a sample security group; adjust the destination CIDRs as applicable to your environment.
  • When you deploy a security group for a Pod, the VPC resource controller creates a special network interface called a branch network interface with a description of aws-k8s-branch-eni and associates the security groups with it.
    -Branch network interfaces are created in addition to the standard and trunk network interfaces attached to the node.
  • Security group for the EKS Pods, which will require a change for DNS resolution (see the CLI sketch below).
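  • Since the security groups above were created in the AWS console, here is a rough CLI sketch of an equivalent setup. The group name and VPC ID mirror values seen later in this article, but treat all IDs, names, and ports as placeholders, and make sure your rules match the DNS and probe-port guidance in the note above.

# Sketch only: create a security group for the sample Pods (names/IDs are placeholders).
POD_SG=$(aws ec2 create-security-group \
--group-name netpol-chain-sg \
--description "Security group for eks-cilium-chain Pods" \
--vpc-id vpc-046ebc1d554831a81 \
--query 'GroupId' --output text)

# Cluster security group associated with the nodes (and hence the CoreDNS Pods).
CLUSTER_SG=$(aws eks describe-cluster --name "${EKS_CLUSTER_NAME}" \
--query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' --output text)

# Allow the node/cluster security group to reach the Pod on port 80 (probe/HTTP port).
aws ec2 authorize-security-group-ingress --group-id "${POD_SG}" \
--protocol tcp --port 80 --source-group "${CLUSTER_SG}"

# Allow DNS (TCP and UDP port 53) out to the cluster security group running CoreDNS.
aws ec2 authorize-security-group-egress --group-id "${POD_SG}" \
--ip-permissions "IpProtocol=tcp,FromPort=53,ToPort=53,UserIdGroupPairs=[{GroupId=${CLUSTER_SG}}]" "IpProtocol=udp,FromPort=53,ToPort=53,UserIdGroupPairs=[{GroupId=${CLUSTER_SG}}]"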
  • Create a Kubernetes namespace to deploy resources to. You can replace eks-cilium-chain with the name of a namespace that you want to use.
kubectl create namespace eks-cilium-chain
  • Deploy an Amazon EKS SecurityGroupPolicy to your cluster.
    -The groupIds entry is the ID of the security group created above.
cat >eks-cilium-chain-security-group-policy.yaml <<EOF
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: eks-cilium-chain
  namespace: eks-cilium-chain
spec:
  podSelector:
    matchLabels:
      role: eks-cilium-chain
  securityGroups:
    groupIds:
      - sg-01b33d4bf8ca28b86
EOF
  • Deploy the policy
kubectl apply -f eks-cilium-chain-security-group-policy.yaml
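  • You can confirm the policy object was created with the following optional check (the SecurityGroupPolicy CRD ships with EKS):

kubectl get securitygrouppolicies -n eks-cilium-chain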
  • Deploy a sample application with a label that matches the eks-cilium-chain value for podSelector that you specified in a previous step.
cat >eks-cilium-chain-application.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eks-cilium-chain
  namespace: eks-cilium-chain
  labels:
    app: eks-cilium-chain
spec:
  replicas: 1
  selector:
    matchLabels:
      app: eks-cilium-chain
  template:
    metadata:
      labels:
        app: eks-cilium-chain
        role: eks-cilium-chain
    spec:
      terminationGracePeriodSeconds: 120
      containers:
        - name: nginx
          image: public.ecr.aws/nginx/nginx:1.23
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: eks-cilium-chain
  namespace: eks-cilium-chain
  labels:
    app: eks-cilium-chain
spec:
  selector:
    app: eks-cilium-chain
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
EOF
  • Deploy the application with the following command. When you deploy the application, the Amazon VPC CNI plugin for Kubernetes matches the role label and the security groups that you specified in the previous step are applied to the Pod.
kubectl apply -f eks-cilium-chain-application.yaml
  • View the Pods deployed with the sample application.
kubectl get pods -n eks-cilium-chain -o wide

NAME                                READY   STATUS    RESTARTS   AGE    IP                NODE                                               NOMINATED NODE   READINESS GATES
eks-cilium-chain-6684995c8d-jm5rj   1/1     Running   0          140m   192.168.118.216   ip-192-168-127-2.ap-southeast-2.compute.internal   <none>           <none>
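  • The VPC resource controller records the allocated branch ENI in a Pod annotation (vpc.amazonaws.com/pod-eni). A quick way to inspect it, not shown in the original run, is:

kubectl -n eks-cilium-chain get pod eks-cilium-chain-6684995c8d-jm5rj -o yaml \
| grep 'vpc.amazonaws.com/pod-eni'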
  • In a separate terminal, shell into one of the Pods.
kubectl exec -it eks-cilium-chain-6684995c8d-jm5rj -n eks-cilium-chain -- /bin/bash
  • Confirm that the sample application works.
curl eks-cilium-chain

<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
  • You received this output because all Pods running the application are associated with the security group that you created. That group contains a rule that allows all traffic between the Pods it is associated with. DNS traffic is allowed outbound from that security group to the cluster security group, which is associated with your nodes; the nodes run the CoreDNS Pods that your Pods used for the name lookup.

Decoding the packet path

Let’s take a closer look at the packet path for this scenario.

Service IP

kubectl get svc -A

NAMESPACE          NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
default            kubernetes         ClusterIP   10.100.0.1       <none>        443/TCP         40m
eks-cilium-chain   eks-cilium-chain   ClusterIP   10.100.158.106   <none>        80/TCP          8m22s
kube-system        hubble-peer        ClusterIP   10.100.118.189   <none>        443/TCP         26m
kube-system        kube-dns           ClusterIP   10.100.0.10      <none>        53/UDP,53/TCP   40m

Client

  • Taking a closer look at the client-side IP address (this is the branch interface IP; see the AWS screenshot above):
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
inet 192.168.190.9 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::44b3:61ff:fe02:74fe prefixlen 64 scopeid 0x20<link>
ether 46:b3:61:02:74:fe txqueuelen 0 (Ethernet)
RX packets 804 bytes 11568672 (11.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 574 bytes 46732 (45.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
  • ARP table entries
root@eks-cilium-chain-6684995c8d-whbgs:/# arp
Address       HWtype   HWaddress           Flags Mask   Iface
169.254.1.1   ether    fe:e4:33:95:06:d5   CM           eth0
  • Client sends a curl request towards the server
curl eks-cilium-chain
  • Tcpdump on the server
04:32:07.242540 IP 169.254.42.1.41794 > eks-cilium-chain-6684995c8d-whbgs.80: Flags [P.], seq 1:81, ack 1, win 491, options [nop,nop,TS val 3199320757 ecr 1137746092], length 80: HTTP: GET / HTTP/1.1
04:32:07.242546 IP eks-cilium-chain-6684995c8d-whbgs.80 > 169.254.42.1.41794: Flags [.], ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 0
04:32:07.242553 IP eks-cilium-chain.eks-cilium-chain.svc.cluster.local.80 > eks-cilium-chain-6684995c8d-whbgs.41794: Flags [.], ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 0
04:32:07.242716 IP eks-cilium-chain-6684995c8d-whbgs.80 > 169.254.42.1.41794: Flags [P.], seq 1:239, ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 238: HTTP: HTTP/1.1 200 OK

AWS Node

  • Taking a closer look at the AWS node gives us more insight into how the networking is set up.
  • ENI Interface
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
inet 192.168.186.196 netmask 255.255.224.0 broadcast 192.168.191.255
inet6 fe80::893:51ff:fe3e:1804 prefixlen 64 scopeid 0x20<link>
ether 0a:93:51:3e:18:04 txqueuelen 1000 (Ethernet)
RX packets 8120 bytes 11965334 (11.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 537 bytes 42684 (41.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
  • Trunk Interface
    -Note- This device is used only by this branch-interface Pod and is not shared with any other Pods on the host.
vlanc3aa50bbbad@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default qlen 1000
link/ether fe:e4:33:95:06:d5 brd ff:ff:ff:ff:ff:ff link-netns cni-7844f809-9291-09b6-e5ca-368d5b777721
inet6 fe80::fce4:33ff:fe95:6d5/64 scope link
valid_lft forever preferred_lft forever
  • Branch Interface
    -Note- Notice that the MAC address matches the branch ENI shown in the AWS screenshot above.
vlan.eth.1@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default qlen 1000
link/ether 0a:47:af:ca:fe:dc brd ff:ff:ff:ff:ff:ff
inet6 fe80::847:afff:feca:fedc/64 scope link
valid_lft forever preferred_lft forever
  • Tcpdump on the AWS node reveals the communication happening over the VLAN interface with:
    -Client IP = 192.168.190.9
    -Service IP = 10.100.158.106
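  • The capture below is from the original run; a command along the following lines (an assumption, since the exact invocation was not shown) would produce it on the node:

# Run on the EKS node: capture HTTP traffic on the per-Pod VLAN device.
tcpdump -i vlanc3aa50bbbad port 80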
listening on vlanc3aa50bbbad, link-type EN10MB (Ethernet), capture size 262144 bytes
04:32:07.242237 IP ip-192-168-190-9.ap-southeast-2.compute.internal.crestron-cip > ip-10-100-158-106.ap-southeast-2.compute.internal.http: Flags [S], seq 4090770323, win 62727, options [mss 8961,sackOK,TS val 3199320756 ecr 0,nop,wscale 7], length 0
04:32:07.242540 IP 169.254.42.1.crestron-cip > ip-192-168-190-9.ap-southeast-2.compute.internal.http: Flags [P.], seq 1:81, ack 1, win 491, options [nop,nop,TS val 3199320757 ecr 1137746092], length 80: HTTP: GET / HTTP/1.1
04:32:07.242547 IP ip-192-168-190-9.ap-southeast-2.compute.internal.http > 169.254.42.1.crestron-cip: Flags [.], ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 0
04:32:07.242552 IP ip-10-100-158-106.ap-southeast-2.compute.internal.http > ip-192-168-190-9.ap-southeast-2.compute.internal.crestron-cip: Flags [.], ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 0
04:32:07.242717 IP ip-192-168-190-9.ap-southeast-2.compute.internal.http > 169.254.42.1.crestron-cip: Flags [P.], seq 1:239, ack 81, win 489, options [nop,nop,TS val 1137746093 ecr 3199320757], length 238: HTTP: HTTP/1.1 200 OK
  • Some more interesting details from a Cilium perspective
[root@ip-192-168-174-233 /]# tc filter show dev vlanc3aa50bbbad@if3 ingress
filter protocol all pref 1 bpf chain 0
filter protocol all pref 1 bpf chain 0 handle 0x1 cil_from_container-vlanc3aa50bbbad direct-action not_in_hw id 540 tag d2b8399c8600f69f jited

[root@ip-192-168-174-233 /]# tc filter show dev vlanc3aa50bbbad@if3 egress
filter protocol all pref 1 bpf chain 0
filter protocol all pref 1 bpf chain 0 handle 0x1 cil_to_container-vlanc3aa50bbbad direct-action not_in_hw id 538 tag ccfdaf679ca3f11b jited

[root@ip-192-168-174-233 /]# ip route show table main
default via 192.168.160.1 dev eth0
169.254.169.254 dev eth0
192.168.160.0/19 dev eth0 proto kernel scope link src 192.168.174.233
192.168.190.9 dev vlanc3aa50bbbad scope link

[root@ip-192-168-174-233 /]# ip rule
9: from all fwmark 0x200/0xf00 lookup 2004
10: from all iif vlan.eth.1 lookup 101
10: from all iif vlanc3aa50bbbad lookup 101
20: from all lookup local
100: from all lookup local
1024: from all fwmark 0x80/0x80 lookup main
32766: from all lookup main
32767: from all lookup default

[root@ip-192-168-174-233 /]# ip route show table 101
default via 192.168.160.1 dev vlan.eth.1
192.168.160.1 dev vlan.eth.1 scope link
192.168.190.9 dev vlanc3aa50bbbad scope link
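  • To confirm that Cilium is managing this Pod as an endpoint (and would therefore also enforce any Cilium network policy on top of the security group), you can list endpoints from a Cilium agent; this check is an addition and was not part of the original capture.

# Exec into a Cilium agent Pod (ideally the one on the node hosting the sample Pod)
# and list the endpoints it manages.
kubectl -n kube-system exec ds/cilium -- cilium endpoint list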

API call towards branch interface

  • If you like working with APIs (for example via a client such as Postman), you can issue a standard EC2 API call to fetch information about the branch interface as well; the same data can be retrieved with the AWS CLI, as sketched below.
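  • A hedged AWS CLI equivalent (not necessarily the exact call used for the output below, but one that returns the same object) filters on the description that the VPC resource controller assigns to branch ENIs:

aws ec2 describe-network-interfaces \
--filters "Name=description,Values=aws-k8s-branch-eni" "Name=addresses.private-ip-address,Values=192.168.190.9"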
{
    "NetworkInterfaces": [
        {
            "AvailabilityZone": "ap-southeast-2c",
            "Description": "aws-k8s-branch-eni",
            "Groups": [
                {
                    "GroupName": "netpol-chain-sg",
                    "GroupId": "sg-01b33d4bf8ca28b86"
                }
            ],
            "InterfaceType": "branch",
            "Ipv6Addresses": [],
            "MacAddress": "0a:47:af:ca:fe:dc",
            "NetworkInterfaceId": "eni-0b7e72e7332cbd210",
            "OwnerId": "679388779924",
            "PrivateDnsName": "ip-192-168-190-9.ap-southeast-2.compute.internal",
            "PrivateIpAddress": "192.168.190.9",
            "PrivateIpAddresses": [
                {
                    "Primary": true,
                    "PrivateDnsName": "ip-192-168-190-9.ap-southeast-2.compute.internal",
                    "PrivateIpAddress": "192.168.190.9"
                }
            ],
            "RequesterId": "720728609012",
            "RequesterManaged": false,
            "SourceDestCheck": true,
            "Status": "in-use",
            "SubnetId": "subnet-032358f8adac3f752",
            "TagSet": [
                {
                    "Key": "eks:eni:owner",
                    "Value": "eks-vpc-resource-controller"
                },
                {
                    "Key": "vpcresources.k8s.aws/trunk-eni-id",
                    "Value": "eni-062a475f4acdc79c2"
                },
                {
                    "Key": "kubernetes.io/cluster/cluster1",
                    "Value": "owned"
                },
                {
                    "Key": "vpcresources.k8s.aws/vlan-id",
                    "Value": "1"
                }
            ],
            "VpcId": "vpc-046ebc1d554831a81"
        }
    ]
}

Some common Error Messages

A few issues to keep in mind while running the demo:

  • If the security group ID is entered incorrectly while creating the security group policy, the application Pods will not come up and you will see error messages like the ones below.
Error Message

Warning BranchAllocationFailed 85s (x3 over 85s) vpc-resource-controller (combined from similar events): failed to allocate branch ENI to pod: InvalidParameterValue: Security group: "sg-0ccf737ba09e2015f" is not contained in the parent network of "subnet-0f6733ad44c138322"
status code: 400, request id: 4371c93e-1901-4bb1-8528-5abfe6569dc9
Normal SecurityGroupRequested 85s (x12 over 87s) vpc-resource-controller Pod will get the following Security Groups [sg-0ccf737ba09e2015f]
Warning FailedCreatePodSandBox 74s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b8f0c08ea730cc31d5c53b3a8bda48c9f3eef20d83c30e7252322aa91b70e4e9": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
Warning FailedCreatePodSandBox 61s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "05a2bbd6e208ba788ec81ec86addea0b6647aefc759cc140cf1f6930da4de02d": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
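  • These events come from the vpc-resource-controller and the kubelet; the quickest way to surface them (a suggested workflow, not from the original text) is to describe one of the stuck Pods:

kubectl -n eks-cilium-chain describe pod <pod-name>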
  • If you see “Insufficient permissions: Unable to create Elastic Network Interface.”, confirm that you added the IAM policy to the IAM cluster role.

Try out Cilium

  • Try out Cilium and get first-hand experience of how it solves real problems and use cases around networking, security, and observability in your cloud-native or on-prem environments.

🌟Conclusion 🌟

Hopefully, this post gave you a good overview of how Cilium works alongside security groups for Pods in EKS. Thank you for reading! 🙌🏻😁📃 See you in the next blog.

🚀 Feel free to connect with or follow me on:

LinkedIn: linkedin.com/in/agamitgupta
