Episode XII: Scalable & Exposable iPAM with Multus

Open 5G HyperCore · Feb 16, 2023

Authors: Fatih Nar (The Architect), Doug Smith (Principal Software Engineer), Tomofumi Hayashi (Principal Software Engineer)

1.0 Introduction

Previously, in Episode-III, we concluded the Multus CNI section with “to leverage multiple networks with your containers (sitting in the same pod), first, you need to plan, design, and then implement your networking fabric”. In this article, we will look for possible solutions to the IP address management challenges of Multus add-on CNI interfaces (i.e., NetworkAttachments), evaluate them as much as we can, come up with educated & tested recommendations, and perhaps steer some new development ideas toward the related CNCF working groups.

Figure-1 Multus POD Lookout

Before we delve into the specifics of Multus and primary/non-primary CNI interface IPAM and how to use them, let’s focus on what scaling can mean per CNF in the 5G core application set, as each 5G application may need different interfaces (due to the variety of peers it interconnects with) and different constraints, depending on the function set it is supposed to deliver.

Figure-2 5G CNF Networking

We can “try” to group some of the 5G applications (e.g., AMF, SMF, and UPF) together, where different interfaces (eth0, net1, net2, etc.) are used towards the RAN/UE and towards other CNFs, while the rest of the 5G applications (e.g., AUSF, UDM, PCF) pretty much only need HTTP/2 + JSON communication, which can be served over the default primary CNI interface (eth0).
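For illustration, attaching a pod to more than one add-on network is done with the k8s.v1.cni.cncf.io/networks annotation. The minimal sketch below assumes the whereabouts100 and whereabouts200 NetworkAttachmentDefinitions defined later in this article already exist in the namespace; the pod name and image are placeholders:

# Minimal sketch: a pod requesting two add-on interfaces in addition to eth0.
# Multus creates net1 and net2 from the listed attachments, in order
# (whereabouts100 -> net1, whereabouts200 -> net2). Image is a placeholder.
apiVersion: v1
kind: Pod
metadata:
  name: amf-example
  annotations:
    k8s.v1.cni.cncf.io/networks: whereabouts100,whereabouts200
spec:
  containers:
  - name: amf
    image: quay.io/example/amf:latest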

2.0 Solution

IP address management with Multus interfaces can be implemented in various ways, such as:

  1. Host-Local: IP allocation management within specified ranges on a “single node” only; storage of IP address allocation is specified in flat files local to each host; hence it is called host-local.
  2. DHCP: IP allocated by an “external” DHCP server.
  3. Whereabouts: IP allocation management within specified ranges across a Kubernetes cluster. Storage of IP address allocation is specified in Kubernetes Custom Resources to enable the allocation of an IP across any host in a cluster.
  4. Static: IP allocated statically per POD at deployment time.
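For comparison with the Whereabouts examples further below, a NetworkAttachmentDefinition using host-local IPAM (option 1) would carry an ipam stanza roughly like the following; the interface name and address ranges are illustrative:

# Illustrative host-local IPAM stanza (option 1): allocations are tracked in
# flat files on the local node only, so the same range is reused independently
# on every node that schedules a pod with this attachment.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: hostlocal-example
spec:
  config: '{
    "cniVersion": "0.3.0",
    "name": "hostlocal-example",
    "type": "macvlan",
    "master": "ens224",
    "mode": "bridge",
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.50.0/24",
      "rangeStart": "192.168.50.10",
      "rangeEnd": "192.168.50.100",
      "gateway": "192.168.50.1"
    }
  }'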

Quick Evaluation:

  • Options (1) and (4) do NOT address the need for Horizontal Pod Autoscaling (HPA) across different nodes; each time the pod scales out, it will “try” to reuse the same static configuration, which will cause overlapping IP address issues.
  • Option (2) needs an external DHCP server per CIDR that the pod(s) attach to, which can be costly (plus operational overhead) to operate outside the Kubernetes cluster, as these networks will likely stay private within the application platform network scope; however, it is still a valid option for horizontal scaling in the realm of external provider networks.
  • Option (3), Whereabouts: let’s dig into what it is and if/how we can use it for a 5G core deployment with 5G CNF pod autoscaling across multiple pod interfaces.

Figure-3 WhereAbouts Mechanics

Whereabouts takes an address range and assigns IP addresses within that range; when an IP address is assigned to a pod, Whereabouts tracks that IP address in a data store for the lifetime of that pod. When the pod is removed, Whereabouts frees the address and makes it available for subsequent requests. Whereabouts always assigns the lowest available address in the range.
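On a CRD-backed installation, each range is tracked in an IPPool custom resource in the whereabouts.cni.cncf.io API group. The trimmed illustration below shows roughly what such an object looks like; treat it as a sketch rather than the exact schema, since field layout and the namespace can differ between Whereabouts versions and installations:

# Sketch of a Whereabouts IPPool custom resource (CRD-backed data store).
# The allocations map is keyed by the offset of the IP within the range;
# each entry records which pod currently owns that address.
apiVersion: whereabouts.cni.cncf.io/v1alpha1
kind: IPPool
metadata:
  name: 192.168.100.0-24          # derived from the range
  namespace: openshift-multus     # depends on how Whereabouts was installed
spec:
  range: 192.168.100.0/24
  allocations:
    "2":
      id: 8e3a...                 # container ID (truncated here)
      podref: 5gcore/debug-app-fenar1-6c9f7d9b5d-abcde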

  • Sample Network Attachment Definition:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: whereabouts100
spec:
config: '{
"cniVersion": "0.3.0",
"name": "was-i2i5gcore-192.168.100",
"type": "macvlan",
"master": "ens224",
"mode": "bridge",
"ipam": {
"type": "whereabouts",
"range": "192.168.100.0/24",
"gateway": "192.168.100.1",
"exclude": [
"192.168.100.1/32"
]
}
}'

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: whereabouts200
spec:
config: '{
"cniVersion": "0.3.0",
"name": "was-i2i5gcore-192.168.200",
"type": "macvlan",
"master": "ens256",
"mode": "bridge",
"ipam": {
"type": "whereabouts",
"range": "192.168.200.0/24",
"gateway": "192.168.200.1",
"exclude": [
"192.168.200.1/32"
]
}
}'
  • Sample Debug POD Definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: debug-app-fenar2
  name: debug-app-fenar-2
spec:
  replicas: 1
  selector:
    matchLabels:
      name: debug-app-fenar2
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: whereabouts200
      labels:
        name: debug-app-fenar2
    spec:
      nodeSelector:
        worker.node: '2'
      containers:
      - image: quay.io/narlabs/debugpod:latest
        name: debug-app-fenar2
        command: [ "/bin/bash", "-c", "--" ]
        args: [ "while true; do sleep 30; done;" ]
        resources:
          requests:
            cpu: 500m
            memory: 768Mi
            ephemeral-storage: 1Gi
          limits:
            cpu: 900m
            memory: 900Mi
            ephemeral-storage: 2Gi
      imagePullSecrets:
      - name: narlabs-i2i-robot-pull-secret


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: debug-app-fenar1
  name: debug-app-fenar-1
spec:
  replicas: 1
  selector:
    matchLabels:
      name: debug-app-fenar1
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: whereabouts100
      labels:
        name: debug-app-fenar1
    spec:
      containers:
      - image: quay.io/narlabs/debugpod:latest
        name: debug-app-fenar1
        command: [ "/bin/bash", "-c", "--" ]
        args: [ "while true; do sleep 30; done;" ]
        resources:
          requests:
            cpu: 500m
            memory: 768Mi
            ephemeral-storage: 1Gi
          limits:
            cpu: 900m
            memory: 900Mi
            ephemeral-storage: 2Gi
      imagePullSecrets:
      - name: narlabs-i2i-robot-pull-secret
  • POD to POD access verification on different nodes:
Figure-4 WhereAbouts POD Test

2.1 Multus (Add-On Networks) Service Discovery

Now that we have solved (almost; more details to come later for multi-cluster scenarios) the automated and dynamic assignment of IP addresses to CNF pod supplementary (i.e., add-on) network interfaces (i.e., Multus non-primary CNI usage), we need to let the external world (i.e., surrounding CNFs) know about this network access (i.e., expose it as a consumable service).

A Kubernetes pod IP address associated with the primary CNI (e.g., default network traffic over eth0) is only valid within the tenant namespace of the same cluster scope (i.e., pod-to-pod communication only within that scope).

The Kubernetes-native way of exposing a pod capability outside of the tenant namespace is the Service construct. A Kubernetes Service is an abstraction that exposes an application running on a set of pods; network traffic to the Service is then load balanced across those pods. Conveniently, you can also refer to the Service by DNS name (as opposed to an IP address), which helps when developing applications that run on Kubernetes.
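As a baseline for comparison with the Multus variant shown later, a minimal standard Service over the primary network could look like this sketch; the name and port are illustrative:

# Minimal ClusterIP Service over the primary (eth0) network, for comparison
# with the Multus add-on network Service shown later. Names are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: amf-sbi
spec:
  selector:
    nf-type: amf          # matches the pod label used in the later examples
  ports:
  - protocol: TCP
    port: 8080            # SBI (HTTP/2) port, illustrative
    targetPort: 8080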

PS: Do not forget that, for the sake of external service continuity when the service traversal path changes (i.e., the serving pod may alter/vary over time for the same consumer/endpoint), there has to be an application-layer service/context stickiness implementation in place between the pods in the replica set (e.g., use of a shared in-memory database, IMDB, over the network).

So far, so good? We have bad news for you: there is no officially supported Kubernetes-native way of exposing a service hosted on a Multus add-on interface yet.

What now?

  • Use cutting-edge developer preview technology (Multus-Service).
  • Revisit 3GPP-defined procedures that can offer 5G CNF registration & discovery.
  • Use an external load balancer, as a physical or a virtual appliance, providing a single entry point (i.e., a VIP address) to the SCTP service domain.

2.1.1 Multus-Service (Our Preferred Solution)

Multus-Service (GitHub) provides Kubernetes Services for pods’ add-on interfaces. It is an alternative controller that handles Kubernetes EndpointSlices; by using a label that tells Kubernetes not to process these EndpointSlices with the default controller, we can have the Service handled by an external controller, in this case multus-service. For details, such as how to install it or how it works, please look into “How to Use Kubernetes Services on Secondary Networks with Multus CNI”.

Figure-5 Multus-Service Components

Multus-service is modular; the initial module, multus-proxy, uses iptables to set up connectivity to a set of EndpointSlices. This is similar to the kube-proxy implementation; however, from an advanced networking perspective, it comes with some limitations and restrictions (being at the developer preview stage):

  • It requires iptables; this means that, while the multus-service roadmap plans to extend its functionality, today it has to use kernel networking. To work with user-space networking (such as DPDK or XDP), multus-service will need to implement modules that can handle those specific traffic types. It also requires IPv4 at the developer preview stage.

Sample AMF K8s Multus Service Definition:

kind: Service
apiVersion: v1
metadata:
  name: sctp-amf-multus-service
  labels:
    service.kubernetes.io/service-proxy-name: multus-proxy
  annotations:
    k8s.v1.cni.cncf.io/service-network: whereabouts100
spec:
  type: NodePort
  selector:
    nf-type: amf
  ports:
  - protocol: SCTP
    port: 38412
    targetPort: 38412
    nodePort: 30414
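For the selector above to pick up endpoints on the add-on network, the AMF pods need the matching label and the same network attachment. The skeleton below is an illustrative sketch (the Deployment name and image are placeholders, not taken from a real AMF chart):

# Illustrative AMF Deployment skeleton: the nf-type label matches the Service
# selector, and the whereabouts100 attachment provides the add-on interface
# that the Multus Service above exposes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: amf
spec:
  replicas: 2
  selector:
    matchLabels:
      nf-type: amf
  template:
    metadata:
      labels:
        nf-type: amf
      annotations:
        k8s.v1.cni.cncf.io/networks: whereabouts100
    spec:
      containers:
      - name: amf
        image: quay.io/example/amf:latest   # placeholder image
        ports:
        - containerPort: 38412
          protocol: SCTP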

Sample AMF K8s Service Details with selected AMF PODs:

Figure-6 Ingress SCTP Traffic Flow via K8s Service Exposure for Add-On Interface with Whereabouts

2.1.2 3GPP Approach

The challenge/ask here is specifically with AMF pods: increasing the SCTP endpoint count of the AMF so that new gNB/UE registrations can be handled by new AMF pods. The newly arriving gNB/UE(s) need to be provisioned with one of the following:

  • [B.1] A new AMF-ID explicitly pointing SCTP traffic to the new AMF pod over the service provider network attached via the add-on interface; this is a high-operational-cost approach and is also prone to many problems.

Or use a single-entry-point approach:

  • [B.2] Virtual IP (VIP) approach: the new AMF pod performs gratuitous ARP (G-ARP) to take over the VIP. We can reserve the VIP address in the Whereabouts exclude configuration (see the sketch after this list). This is an active-standby approach rather than scaling out capacity.
  • [B.3] Use of an external-DNS-hosted AMF-ID FQDN that resolves to multiple AMF pod destinations. This approach relies on the 5G RAN supporting a compatible DNS notification mechanism, where the RAN subscribes to the DNS list and is thus notified of the creation and deletion of AMF pods. Details for scaling:

1) Scale-Up: when the control system running in the Service Orchestrator (SO) detects the need for a scale-out of the AMF, it increases the AMF ReplicaSet (RS) count and a new AMF pod comes to life. In addition, it notifies the DNS of the creation of the additional AMF pod; in turn, the DNS pushes the new AMF pod IP to the RAN. If needed, the control system also pushes the new relative capacity for each AMF pod.

2) Scale-Down: when the control system running in the Service Orchestrator (SO) detects the possibility of a scale-in, the SO decreases the AMF RS count and the AMF pod gets terminated by the Kubernetes control plane; the SO then updates the DNS to remove the AMF pod IP. Consequently, the RAN will not send any new procedure to this AMF pod instance.
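For [B.2], reserving the VIP in Whereabouts simply means keeping it out of the allocatable range via the exclude list. A minimal sketch, reusing the whereabouts100 attachment from earlier and assuming 192.168.100.2 as the VIP:

# Sketch: reserving an assumed VIP (192.168.100.2) for the active-standby
# takeover in [B.2] by excluding it from Whereabouts allocation, so no pod
# ever receives it via IPAM and it can be claimed via gratuitous ARP instead.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: whereabouts100
spec:
  config: '{
    "cniVersion": "0.3.0",
    "name": "was-i2i5gcore-192.168.100",
    "type": "macvlan",
    "master": "ens224",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "192.168.100.0/24",
      "gateway": "192.168.100.1",
      "exclude": [
        "192.168.100.1/32",
        "192.168.100.2/32"
      ]
    }
  }'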

For more information, please check out our Cloud Burst Article (Episode-VII) Section 2.3 Burst The Traffic.

Associated References from 3GPP Technical Specifications:

TS 23.501

  • [Section 5.9.5] AMF ID / naming convention, ID parts and their purposes.
  • [Section 6.3.5] AMF Discovery & Selection.

TS 29.303

  • [Section 7.2] Procedure for AMF Discovery by 5G-AN via DNS.

2.1.3 External LB Appliance/Service

This is an approach similar to multus-service, but in a way you are re-inventing the service ingress for the particular pod traffic (i.e., SCTP) with an extra appliance (e.g., LoxiLB), which brings additional complexity to solution design, deployment, and lifecycle management.

2.2 Multi-Cluster Coverage with Whereabouts

Whereabouts could certainly work in a multi-cluster environment if we build (Can we build it? Yes we can! Bob The Builder) the right shared data store to use across clusters. Currently, Whereabouts uses Kubernetes Custom Resources to store the data that represents the allocated IP addresses.

The two options, as we see it, are either a way for Kubernetes multi-cluster environments to share Custom Resources across clusters, or a key-value store (KVS) available across clusters (the preferred way):

Figure-7 Multi-Cluster Whereabouts

2.2.1 Use of CRD

The Kubernetes API is a bit of a bottleneck for Whereabouts as it stands because of the need to use a Kubernetes Lease object to elect a leader (to prevent write contention). There is a lot of overhead in acquiring the lease, and the Custom Resource objects are less efficient to walk through during allocation; this burns a lot of time and CPU cycles.

2.2.2 Use of Key-Value Store (Preferred Approach)

With the use of a dedicated key-value store (KVS), we can buy a lot of efficiency (and therefore scale) for Whereabouts. Additionally, if this KVS is available across clusters, we can expand the control of Whereabouts software-defined networking over multiple clusters with high availability.

The challenges lie in finding the right KVS (e.g., Redis, Hazelcast, VoltDB) and having an “automagic” configuration and management of that KVS. This KVS will need an operator that provides an opinionated installation and lifecycle management for it.

A KVS available across clusters, together with a proper operator, would be the ideal solution. Whereabouts would also need to be modified to support multiple backends, both the KVS and the Kubernetes API via CRDs. The maintainers have already voiced that they would like to create a layer within Whereabouts that enables extensions supporting multiple data stores.
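To make the idea concrete, a multi-backend Whereabouts could accept a data store selector in its IPAM config. The stanza below is purely hypothetical: the "datastore" and "kvs_endpoints" keys do not exist in today's Whereabouts, and it only illustrates the kind of extension point discussed above:

# Hypothetical only: illustrates the proposed multi-backend extension point.
# The "datastore" and "kvs_endpoints" keys are NOT part of today's Whereabouts;
# the current implementation stores allocations in Kubernetes Custom Resources.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: whereabouts-multicluster
spec:
  config: '{
    "cniVersion": "0.3.0",
    "name": "multicluster-net",
    "type": "macvlan",
    "master": "ens224",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "10.100.0.0/16",
      "datastore": "kvs",
      "kvs_endpoints": "kvs-0.example.net:6379,kvs-1.example.net:6379"
    }
  }'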

3.0 Summary

Alright, to sum up what we have talked about:

[I] IP address management for additional interfaces of microservices hosted on Kubernetes is possible with Whereabouts.

[II] Making additional interfaces reachable for external traffic is possible, with multus-service as the preferred approach due to its simplicity in design and operation and for being the most cloud-native option. Alternatives are using an external load-balancing CNF appliance or, worst of all, the 3GPP spaghetti approach of scaling the service with external procedures.

[III] Enlarging the scope/coverage of Whereabouts into a multi-cluster Multus SDN is doable with add-on development.

Call to Action: We are actively seeking solution partners for [III] to develop, test, and upstream a full working solution. We need key-value store (KVS) software partners to implement a distributed KVS with near-real-time synchronization among the instances. Please reach out to Fatih Nar over LinkedIn.
