
Simplifying Network Management with Cilium’s BGP Auto-Discovery

5 min read · Jun 23, 2025

Large-scale Kubernetes environments demand networks that are both robust and scalable. Traditionally, Cilium’s BGP implementation required users to explicitly specify peer IP addresses in BGP cluster configurations to establish BGP sessions with Top-of-Rack (ToR) switches. While this approach works adequately in small environments, it becomes difficult to manage for large-scale deployments involving thousands of Kubernetes nodes distributed across numerous racks, and maintaining BGP configuration files for such clusters significantly increases operational complexity and overhead. My recent work on the Cilium project introduced a new feature called BGP Auto-Discovery that addresses this challenge, streamlining BGP cluster configuration and improving operational efficiency and reliability. In this post, we’ll explore this feature, examine how it works, and walk through practical steps to try it out.

What is BGP Auto-Discovery in Cilium?

In Cilium, BGP primarily serves to establish sessions with ToR switches for advertising and receiving routes. Traditionally, each ToR switch required explicit configuration in the BGP cluster setup, creating the operational overhead described above.

The auto-discovery feature enables Cilium’s BGP control plane to automatically discover BGP peers without providing the peer IP explicitly. Currently, Cilium supports DefaultGateway mode for peer auto-discovery. This mode identifies the node’s default gateway for the specified address family (IPv4 or IPv6) and automatically establishes a BGP session with the discovered gateway.

Here’s how you can configure auto-discovery in your Cilium setup:

Cilium BGP cluster configuration (subset of the cluster config)

peers:
- name: "tor-switch"
  peerASN: 65000
  autoDiscovery:
    mode: "DefaultGateway"
    defaultGateway:
      addressFamily: ipv6 # or "ipv4"
  peerConfigRef:
    name: "cilium-peer"

This configuration becomes even more streamlined when you enable “BGP listen-range” on the ToR switch, eliminating the need to specify individual Cilium node IP addresses on the switch. The ToR switch listens on an entire subnet, and when it receives a BGP message from any address within that subnet, it processes the message to establish the BGP session. Additionally, ensure ASN consistency across all ToR switches to keep the BGP cluster configuration in Cilium uniform.

# frr configuration
router bgp 65100
neighbor CILIUM peer-group
neighbor CILIUM local-as 65000 no-prepend replace-as
bgp listen range fd00:10:0:1::/64 peer-group CILIUM

The “neighbor CILIUM local-as 65000 no-prepend replace-as” line allows the ToR switch to present a consistent ASN (65000) to all Cilium nodes while maintaining its actual routing-domain identity (65100).
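For contrast, without “bgp listen range” the ToR switch would need one neighbor statement per Cilium node. A sketch of what that looks like (the node addresses here are illustrative, not from the lab):

# frr configuration (without listen-range)
router bgp 65100
 neighbor CILIUM peer-group
 neighbor CILIUM local-as 65000 no-prepend replace-as
 neighbor fd00:10:0:1::2 peer-group CILIUM
 neighbor fd00:10:0:1::3 peer-group CILIUM
 # ...one line per Cilium node, which is exactly what listen-range avoids

Every node added to the cluster would require a matching change on the switch, which is the operational overhead that listen-range plus auto-discovery removes.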

Running Cilium with BGP Auto-Discovery on Minikube

You can experiment with Cilium’s BGP Auto-discovery feature without needing access to a large production cluster. By leveraging minikube, you can easily set up a multi-node Kubernetes environment directly on your laptop for testing and development purposes.

Prerequisites

Before proceeding, ensure you have access to a Linux machine. All commands and installation steps in this guide were tested on Ubuntu.

You will need:

A Linux server with sudo privileges

Docker to run Containerlab and FRR containers

Internet access for downloading software and Docker images

Sufficient resources: a minimum of 4 vCPUs and 8 GB RAM (to run Minikube, Containerlab, and the Cilium components comfortably)

Installing the Components of Our Setup

Install minikube

Minikube will be used to create the Kubernetes cluster.

curl -LO https://github.com/kubernetes/minikube/releases/latest/download/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube && rm minikube-linux-amd64

Install cilium cli

cilium-cli will be used to install Cilium on the nodes and to run BGP peer commands.

CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

You can also use Helm to install Cilium on your nodes as an alternative to cilium-cli.

Install kubectl

kubectl will be used to interact with and manage the Kubernetes cluster.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Install Containerlab

Containerlab will be used to simulate the network.

curl -sL https://containerlab.dev/setup | sudo -E bash -s "all"

If you want to run Docker without sudo, ensure your user is added to the docker group:

sudo groupadd docker   # may already exist; the resulting error is safe to ignore
sudo usermod -aG docker $USER
newgrp docker

Running BGP Auto-Discovery

Run Cilium on Minikube

For this walkthrough, create a single-node Minikube cluster:

minikube start --cni=false
kubectl get pods -n kube-system

Install a Cilium 1.18.0 pre-release on your Minikube cluster (the 1.18.0 production release is scheduled for July 29, 2025):

cilium install --version 1.18.0-pre.3 --set bgpControlPlane.enabled=true
kubectl -n kube-system rollout status ds/cilium

Cilium is deployed as a DaemonSet, so every node in the cluster runs a Cilium agent. In our case, we have only one node.

Run Containerlab

We will be using Containerlab to simulate the network. To keep things simple, the topology has one router connected to one server.

Containerlab Topology File (topology.yaml)

name: bgp-autodiscovery
topology:
  nodes:
    # A simple BGP router that peers with Cilium over eBGP.
    router0:
      kind: linux
      image: frrouting/frr:latest
      cmd: bash
      exec:
        - sysctl net.ipv6.conf.all.forwarding=1
        - ip addr add fd00:10:0:1::1/64 dev net0
        - touch /etc/frr/vtysh.conf
        - touch /var/log/frr.log
        - chown frr:frr /var/log/frr.log
        - sed -i -e 's/bgpd=no/bgpd=yes/g' /etc/frr/daemons
        - /usr/lib/frr/frrinit.sh start
        - >-
          vtysh -c 'conf t'
          -c 'router bgp 65010'
          -c ' neighbor CILIUM peer-group'
          -c ' neighbor CILIUM remote-as 65001'
          -c ' neighbor CILIUM local-as 65000 no-prepend replace-as'
          -c ' bgp listen range fd00:10:0:1::/64 peer-group CILIUM'
          -c '!'
    server0:
      kind: linux
      image: nicolaka/netshoot:v0.11
      network-mode: container:minikube
      exec:
        - ip addr add fd00:10:0:1::2/64 dev net0
        - ip route add fd00::/16 via fd00:10:0:1::1 dev net0
        - ip route replace default via fd00:10:0:1::1 dev net0
  links:
    - endpoints: ["router0:net0", "server0:net0"]

Deploy your topology using Containerlab:

containerlab deploy -t topology.yaml

The next step is to apply the BGP configuration to the Cilium node. The BGP control plane is enabled, but there is no configuration yet to establish BGP sessions.

Apply Cilium BGP Configuration

Cilium BGP configuration (bgp.yaml)

apiVersion: cilium.io/v2
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp
spec:
  bgpInstances:
  - name: "65001"
    localASN: 65001
    peers:
    - name: "65000"
      peerASN: 65000
      autoDiscovery:
        mode: "DefaultGateway"
        defaultGateway:
          addressFamily: ipv6
      peerConfigRef:
        name: "cilium-peer"
---
apiVersion: cilium.io/v2
kind: CiliumBGPPeerConfig
metadata:
  name: cilium-peer
spec:
  gracefulRestart:
    enabled: true
    restartTimeSeconds: 15
  families:
  - afi: ipv4
    safi: unicast
    advertisements:
      matchLabels:
        advertise: "bgp"
  - afi: ipv6
    safi: unicast
    advertisements:
      matchLabels:
        advertise: "bgp"

Apply the BGP configuration to the Cilium node:

kubectl apply -f bgp.yaml
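Note that the CiliumBGPPeerConfig above matches advertisement resources labeled `advertise: "bgp"`, but this walkthrough doesn’t define one, so the session comes up without advertising routes. If you want the node’s PodCIDR advertised over the session, a minimal sketch of such a resource (an addition to the lab, not part of it) looks like this:

```
apiVersion: cilium.io/v2
kind: CiliumBGPAdvertisement
metadata:
  name: pod-cidr
  labels:
    advertise: "bgp"   # matched by the peer config's matchLabels
spec:
  advertisements:
  - advertisementType: "PodCIDR"
```

Apply it alongside bgp.yaml and the PodCIDR routes will appear on the FRR router once the session is established.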

Validate BGP Session

FRR Router

docker exec -it clab-bgp-autodiscovery-router0 vtysh -c "show bgp summary"

Cilium Node

# you need cilium-cli installed to run the following command
cilium bgp peers

Great job! You’ve configured BGP peering with your router using auto-discovery in DefaultGateway mode, eliminating the need to explicitly provide the peer IP address.

Cleanup

Shut down the lab:

containerlab destroy -t topology.yaml

Delete the Minikube cluster:

minikube delete

Multi-homing Considerations

In setups with multiple default gateways (multi-homing), Cilium selects the default gateway whose route has the lowest metric (the kernel’s route priority). However, note that:

  • Only one BGP session per address family is active at any time.
  • If the primary gateway route fails, Cilium automatically reconciles to establish a session with the alternative gateway.
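A quick way to see the candidate gateways on a node is to list its default routes. The addresses and metrics in the comments below are illustrative, not from the lab:

```shell
# List IPv6 default routes; auto-discovery picks its peer from these.
# On a multi-homed node you might see two candidates, e.g.:
#   default via fd00:10:0:1::1 dev net0 metric 100   <- lower metric: selected
#   default via fd00:10:0:2::1 dev net1 metric 200
ip -6 route show default
```

If the selected route disappears, the reconciliation described above tears down the old session and peers with the gateway of the next-best remaining default route.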

Limitations and Workarounds

Default Gateway Auto-Discovery doesn’t support multiple concurrent BGP sessions per address family in multi-homing scenarios. The current workaround is to configure peer addresses manually.
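As a sketch of that workaround, each peer can be pinned to an explicit address with the `peerAddress` field instead of `autoDiscovery` (the switch names and addresses here are illustrative):

```
peers:
- name: "tor-switch-1"
  peerASN: 65000
  peerAddress: "fd00:10:0:1::1"
  peerConfigRef:
    name: "cilium-peer"
- name: "tor-switch-2"
  peerASN: 65000
  peerAddress: "fd00:10:0:2::1"
  peerConfigRef:
    name: "cilium-peer"
```

This restores concurrent sessions to both ToR switches, at the cost of the per-node address bookkeeping that auto-discovery was designed to remove.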

Conclusion

Cilium’s new BGP Auto-Discovery feature greatly simplifies network automation in Kubernetes environments. It reduces complexity, enhances reliability, and accelerates operational workflows. Leveraging this capability helps network engineers build more robust and agile infrastructures.
