GKE’s “cluster-ipv4-cidr” flag

Daz Wilkin
5 min read · Jul 25, 2017


‘“Ahhh, this porridge is just right,” she said happily and she ate it all up.’
The Story of Goldilocks and the Three Bears

When creating a cluster on Google Container Engine (GKE), it is possible to specify a user-defined IP address range for the cluster with the flag --cluster-ipv4-cidr. The range may be specified by a CIDR between /8 (==16,777,216 IPs) and /19 (==8,192 IPs). If unspecified, a default range of /14 (==262,144 IPs) is used.
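The arithmetic behind those counts is simply 2^(32−p) addresses for a prefix length p. A quick shell sketch (the `cidr_size` helper is my own naming, not a gcloud command):

```shell
# Number of IP addresses in a CIDR block with prefix length p: 2^(32 - p)
cidr_size() { echo $(( 1 << (32 - $1) )); }

cidr_size 8    # largest allowed range:  16777216
cidr_size 14   # the default:            262144
cidr_size 19   # smallest allowed range: 8192
```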

It is important to choose a “Goldilocks” size for your cluster because it is not currently possible to adjust the IP address range after the cluster has been created. If the range is too large, you hoard IP addresses that go unused; if it is too small, you may outgrow it and be forced to migrate to a new, larger cluster.

So, how does GKE networking map to Google Compute Engine (GCE) Virtual Private Cloud (VPC) networking? And, how does GKE utilize the IP address range assigned to it?

GCE VPC networking (“GCE networks”) is summarized here. Suffice it to say that one GCE network comprises one private (RFC 1918) address space and, while RFC 1918 reserves three blocks, customers commonly use 10/8, which corresponds to 10.0.0.0–10.255.255.255 (almost 17 million IPs).

When a GKE cluster is created on a GCE network, a block of IP addresses is allocated to the GKE cluster for all the cluster’s IP needs: nodes, pods, services. It may not be apparent but this IP address space must not be allocated to an existing subnet on this GCE network:

Example #1: GKE cluster IP address space must not be already allocated on the GCE network

CLUSTER=my-cluster
NETWORK=my-network
PROJECT=my-project
REGION=us-west1
SUBNET=my-subnet
ZONE=us-west1-c
gcloud compute networks create ${NETWORK} \
--project=${PROJECT} \
--mode=custom
gcloud compute networks subnets create ${SUBNET} \
--project=${PROJECT} \
--network=${NETWORK} \
--region=${REGION} \
--range=10.0.0.0/9

${SUBNET} uses 50% of the RFC 1918 10/8 address space, leaving no room between 10.0.0.0–10.127.255.255 for the GKE cluster. 10.1.0.0/19 falls within this range, so cluster creation fails:

gcloud container clusters create ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--network=${NETWORK} \
--subnetwork=${SUBNET} \
--cluster-ipv4-cidr=10.1.0.0/19

results in:

ERROR:
Requested CIDR 10.1.0.0/19 for containers is not available in network “${NETWORK}” for cluster.
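That overlap can be checked locally before attempting cluster creation. A minimal bash sketch (the `ip2int` and `overlaps` function names are mine, not part of gcloud): two CIDRs overlap exactly when their network addresses agree under the shorter of the two prefix masks.

```shell
# ip2int: dotted quad -> 32-bit integer
ip2int() {
  local IFS='.' a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# overlaps CIDR1 CIDR2 -> prints "overlap" or "disjoint"
overlaps() {
  local net1="${1%/*}" len1="${1#*/}" net2="${2%/*}" len2="${2#*/}"
  local min_len=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( (0xFFFFFFFF << (32 - min_len)) & 0xFFFFFFFF ))
  if [ $(( $(ip2int "$net1") & mask )) -eq $(( $(ip2int "$net2") & mask )) ]; then
    echo overlap
  else
    echo disjoint
  fi
}

overlaps 10.0.0.0/9 10.1.0.0/19    # the failing example:   prints "overlap"
overlaps 10.0.0.0/9 10.128.0.0/19  # the example below:     prints "disjoint"
```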

Example #2: How to determine the GKE cluster IP address space

Let’s create the cluster in a non-assigned address space. We are able to use anything above 10.127.255.255 in the network created in the previous example:

gcloud container clusters create ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--network=${NETWORK} \
--subnetwork=${SUBNET} \
--cluster-ipv4-cidr=10.128.0.0/19

The cluster’s Nodes run as GCE VMs. These VMs consume IP addresses (one per VM) on the network and subnet referenced during cluster creation; that subnet is 10.0.0.0/9 (10.0.0.0–10.127.255.255). Let’s enumerate the GKE Nodes (== GCE VMs):

gcloud compute instances list \
--project=${PROJECT}
NAME ZONE INTERNAL_IP STATUS
84cdad0f-0t68 us-west1-c 10.0.0.2 RUNNING
84cdad0f-t9hb us-west1-c 10.0.0.3 RUNNING

NB: the VMs (== GKE Nodes) have IP addresses drawn from the subnet 10.0.0.0/9.

But, what of the IP addresses that will be used by the GKE cluster for its pods, services etc.? It is important to understand that these IP addresses are not drawn from the subnet’s IP address space (10.0/9) but from the address space that was allocated to the cluster during its creation (10.128/19).

Let’s confirm. A block of IP addresses is pre-allocated to each Node in the cluster. For GKE this defaults to a /24 block (==256 IPs). Let’s check the Node (84cdad0f-0t68) that appears in the list above. The following is edited for clarity:

kubectl describe node/84cdad0f-0t68
Name: 84cdad0f-0t68
Addresses:
InternalIP: 10.0.0.2
Hostname: 84cdad0f-0t68
PodCIDR: 10.128.1.0/24

NB:
— the InternalIP matches the VM’s InternalIP, as expected
— the PodCIDR is drawn from the cluster’s (!) IP address range: 10.128.1.0/24 == 10.128.1.0–10.128.1.255

The PodCIDR range is fixed on GKE. Each Node in the cluster will be given a /24 network (==256 IPs). These addresses are bound to the Node and may not be assigned elsewhere in the cluster (nor in the GCE network).

What other addresses does GKE use? It uses IP addresses for Services. We can determine which IP address range is allocated by the cluster for use in Services with:

gcloud container clusters describe ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE}
clusterIpv4Cidr: 10.128.0.0/19
servicesIpv4Cidr: 10.128.16.0/20
status: RUNNING
subnetwork: my-subnet
zone: us-west1-c

NB
clusterIpv4Cidr is the range specified when we created the cluster
servicesIpv4Cidr is a /20 subset (==4096 IPs) of this space for Services

Math Test #1: How many Nodes can this cluster support?

The cluster CIDR is 10.128.0.0/19 which provides 8192 IP addresses

The cluster pre-allocates 10.128.16.0/20 (==4096 IPs) for Services

This leaves 4096 (8192–4096) IP addresses for everything else (Nodes).

Each Node is allocated /24 IP address range which is 256 IP addresses

So, the cluster has sufficient IPs for (8192–4096)/256 == 16 nodes.
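The same arithmetic as a shell sketch, using the sizes from the cluster above:

```shell
CLUSTER_IPS=$(( 1 << (32 - 19) ))  # /19 == 8192 IPs for the whole cluster
SERVICE_IPS=$(( 1 << (32 - 20) ))  # /20 == 4096 IPs reserved for Services
NODE_IPS=$(( 1 << (32 - 24) ))     # /24 == 256 IPs allocated per Node

echo $(( (CLUSTER_IPS - SERVICE_IPS) / NODE_IPS ))  # 16
```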

Let’s prove our math:

gcloud container clusters resize ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--size=16

And:

gcloud container clusters describe ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--format="json" \
| jq '.currentNodeCount'
16

Now let’s go one Node beyond what should be permitted:

gcloud container clusters resize ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--size=17

And:

gcloud container clusters describe ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--format="json" \
| jq '.currentNodeCount'
16

NB Even though 17 nodes are requested, the cluster size cannot exceed 16 nodes

I’m a fan of jq but, if you don’t (want to) have it installed, here’s a pure gcloud command for querying the cluster’s node count:

gcloud container clusters describe ${CLUSTER} \
--project=${PROJECT} \
--zone=${ZONE} \
--format="value(currentNodeCount)"

Considerations

It is common for networking administrators to assign IP blocks to projects. It is quite common for organizations to want to use a single 10/8 address space across the organization. Without an understanding of the IP address space requirements for GKE and knowledge that cluster address space cannot be changed once created, network administrators may routinely under-assign space to GKE clusters.

The math in this document should help you argue for more appropriately sized (and future-proof) blocks.

If a user-defined CIDR is not specified during cluster creation, Google allocates a /14, which provides 262,144 IPs and, using the math outlined above, a maximum of (262,144–4,096)/256 == 1,008 nodes.

The largest user-defined GKE cluster range is a /8, which would consume a GCE network entirely: 16,777,216 IPs and (16,777,216–4,096)/256 == 65,520 nodes.

The smallest user-defined range is a /19, as outlined in the example above, with only 16 nodes.
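All three cases follow the same formula. A small helper (my own naming) that assumes the /20 Services carve-out and /24-per-Node allocations observed in the examples above:

```shell
# Max Nodes for a cluster CIDR of prefix length p, assuming GKE reserves a
# /20 (4096 IPs) for Services and allocates a /24 (256 IPs) per Node.
max_nodes() { echo $(( ( (1 << (32 - $1)) - 4096 ) / 256 )); }

max_nodes 19  # smallest: 16
max_nodes 14  # default:  1008
max_nodes 8   # largest:  65520
```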

References

https://cloud.google.com/sdk/gcloud/reference/container/clusters/create
https://cloud.google.com/compute/docs/vpc/
https://tools.ietf.org/html/rfc1918
http://www.ipaddressguide.com/cidr
