GCP Routing Adventures (Vol. 1)
“I perfectly understand how GCP routing works!” This is what I keep telling myself every time I start a new network design.
While I’m not starting from scratch every time, I always need to refresh a few fundamental concepts, which are key to mastering the topic, before I can design things right.
After some long days experimenting (again!) with teammates, I decided to write this article, so I can have a quick reference for myself, and something that can -hopefully- be useful to a broader audience.
Key topics we’ll go through in this article include:
- VPC routing basics
- Dynamic routing mode: global vs regional
- The effect of static and dynamic routes
- Route Priorities: base costs and cross-regional (region-to-region) penalties
- Multi-region deployments routing basics
VPCs, Cloud Routers and routing tables
Let’s start from some basic concepts:
- GCP Virtual Private Clouds (VPCs) are global virtual networks, spanning across all GCP regions. Resources in the same VPC network can communicate with each other, assuming firewall rules are in place.
- Within a VPC network, we can create regional subnets. Each subnet has an associated CIDR (not counting alias ranges). Subnet CIDRs can’t overlap within the same VPC network.
- Cloud Routers (CR) are regional resources that exchange dynamic (BGP) routes between a VPC network and other VPC networks or third-party routers.
- Routes in VPC networks have an associated route priority. The route priority is influenced by many factors, several of which are examined in this article. Priorities are also often referred to as (route) costs, so we’ll use both terms moving forward. A lower value means a more preferred route.
A first, important misconception is to assume that each VPC network is associated with one, global routing table.
This is often due to how -graphically- routing entries are represented in the GCP console: indeed, as a bunch of entries sitting in a single table.
Turns out that this is not entirely true. Before digging more into it, we should pause here and understand what Dynamic Routing Mode is.
Dynamic Routing Mode
You might have noticed that VPC networks have a parameter called Dynamic Routing Mode, which can be set either when creating them or afterwards.
This parameter governs how dynamic routes (read BGP routes, whether they come from third-party network appliances, VPNs or interconnects) are exchanged from one region to another within the same VPC network, and how these routes get programmed in the underlying network.
Here are the options available and the effects they have on dynamic routes.
Within a VPC network, if dynamic routing mode is set to regional, Cloud Routers make the routes they learn available only to the instances in their own region.
On the other hand, if dynamic routing mode is set to global, Cloud Routers within the same VPC network make the routes they learn available to any instance (i.e. VM) in the VPC network, regardless of the region of the Cloud Router or the region of the instances.
A pragmatic approach to routes and routing table(s)
So, how can a VPC network routing table be global if, under certain conditions, Cloud Routers make routes available only to the instances in their own region?
As mentioned, it turns out this assumption is not completely true.
Over time, I built a mental model which helps me to rationalize how things work: I like to imagine that VPC networks have multiple regional routing tables.
When static routes (the ones we add manually) are created, they are programmed with the same priority in all the regional routing tables. As a result, all instances are able to leverage those routes, regardless of where they are located.
For dynamic routes, it depends:
- if dynamic routing mode is regional, dynamic routes are programmed only in the regional routing table where the Cloud Router lives (so they will be effective on instances of that region only)
- if dynamic routing mode is set to global, dynamic routes are (ideally) stored in each regional routing table, so that those routes are available to any instance in the VPC.
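To make this mental model concrete, here is a minimal Python sketch of it. It only illustrates the model described above, not how GCP actually implements routing; the function name, the region list and the prefixes are assumptions made up for the example, and priorities are deliberately left out.

```python
from collections import defaultdict

REGIONS = ["europe-west3", "europe-west4"]  # assumed regions for the example


def program_routes(static_routes, dynamic_routes, routing_mode):
    """Return {region: [prefix, ...]} following the mental model.

    static_routes:  list of prefixes
    dynamic_routes: list of (prefix, cloud_router_region) tuples
    routing_mode:   "regional" or "global"
    """
    if routing_mode not in ("regional", "global"):
        raise ValueError(f"unknown routing mode: {routing_mode}")
    tables = defaultdict(list)
    for region in REGIONS:
        # Static routes land in every regional routing table.
        tables[region].extend(static_routes)
    for prefix, router_region in dynamic_routes:
        # Dynamic routes stay in the Cloud Router's region unless mode is global.
        targets = REGIONS if routing_mode == "global" else [router_region]
        for region in targets:
            tables[region].append(prefix)
    return dict(tables)


static = ["10.10.0.0/16"]  # hypothetical static route
dynamic = [("192.168.0.0/24", "europe-west3")]  # learned by a CR in europe-west3

regional = program_routes(static, dynamic, "regional")
global_ = program_routes(static, dynamic, "global")
print(regional)
print(global_)
```

With regional mode, only the europe-west3 table contains the dynamic prefix; with global mode, both tables do.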
What makes the global option even more interesting is the priority associated with the dynamic routes, as these get propagated. This is what we’ll focus on in the next paragraph.
Dynamic Routes priority and cross-regional penalty
While static routes are programmed in all regional routing tables with the same priority, this is not true for dynamic routes.
The mental rule I use in this case is:
Whenever a dynamic route is propagated from one region to another, a cross-regional penalty is added.
The cross-regional (XR) penalty (also known as region-to-region cost) is the extra cost “we pay” along the path to send traffic cross-region.
Think about it… Isn’t this logical? Going cross-region means having traffic travel many more miles before reaching the destination.
It’s important that instances (i.e. VMs) and other routers are aware of this and, leveraging priorities (the total cost), consider that path longer.
Keep in mind that the cross-regional penalty is different for each pair of regions and it’s dynamic, meaning it is subject to change over time. Its value is between 201 and 9999 (inclusive).
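The rule fits in a tiny Python helper. This is just a sketch of the arithmetic: the function name is mine, and the priority and penalty values in the example are made up, since the real penalty depends on the pair of regions and changes over time.

```python
XR_PENALTY_MIN = 201   # lower bound quoted above
XR_PENALTY_MAX = 9999  # upper bound quoted above


def propagated_priority(priority, xr_penalty):
    """Priority of a dynamic route after it crosses one region boundary."""
    if not XR_PENALTY_MIN <= xr_penalty <= XR_PENALTY_MAX:
        raise ValueError("cross-regional penalty must be in 201..9999")
    return priority + xr_penalty


# 150 is a made-up starting priority, 250 a made-up penalty for some region pair.
print(propagated_priority(150, 250))  # 400
```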
The easiest way to dig more into this is to jump on a practical example, to see how Cloud Routers and VPC networks behave, as routing settings change.
First example: regional routing
Let’s start simple and let’s add more, as we move forward.
In the diagram we see two VPC networks (left and right) with dynamic routing mode set to regional.
Each VPC network has two subnets: the upper one configured in europe-west3 (ew3), and the bottom one, configured in europe-west4 (ew4).
The VPC networks are connected with HA VPN (and related Cloud Routers), set up between subnets in ew3. In this scenario, the left Cloud Router advertises the VPC network subnet routes to the right Cloud Router (and vice-versa).
Let’s observe the right VPC network and its routing tables, split by region. These are represented on the right side of the diagram.
The right Cloud Router is receiving only one route: 192.168.0.0/24, whose next-hop is the other end of the VPN tunnel.
This is advertised from left to right with a base cost of 100 (the default base priority in GCP).
It’s important to note that, since the dynamic routing mode is regional:
- the left Cloud Router advertises only the subnet(s) from its region to the right Cloud Router
- the right VPC network routing table in ew4 is empty (the diagram doesn’t include local subnet routes), as the Cloud Router in ew3 is not propagating routes to other regions
- instances in the right VPC network (ew4) have no way to contact instances in the left VPC network (ew3), since the regional routing table of the right VPC network in ew4 contains no route towards them
Let’s add one more piece: let’s instruct the CR located in the left VPC to also advertise a custom route, 0.0.0.0/0, to the CR in the right VPC.
Notice that this new advertisement is received by the right Cloud Router as well, with the base cost of 100.
Let’s now add a second VPN tunnel in ew4 to see how things change.
The new tunnel installs a new entry in the right VPC network routing table (in ew4), so that its instances can potentially communicate with instances in the left VPC, within the same region. Note that instances in the right VPC network (ew4) still can’t reach instances in the left VPC network (ew3).
Second example: global routing
Let’s start the exercise from scratch, but this time let’s set the VPCs dynamic routing mode to global.
Looking again at the routing tables in right VPC network, we can see how the route propagation is different.
Given the dynamic routing mode is set to global, a few things happened:
- The Cloud Router in the left VPC network (ew3) advertised routes from all regions of the VPC where it lives, which in this case also includes ew4. This is why we find a second entry, for 192.168.128.0/24, in the routing table of the right VPC network (ew3).
- When the left VPC Cloud Router advertised routes from the other region (ew4), it added a cross-regional penalty to the base route cost. In this example, the cross-regional cost between ew3 and ew4 is 208 (at the time of writing) which, added to the base cost of 100, makes 308. This is the priority associated with the entry 192.168.128.0/24, stored in the routing table of the right VPC network (ew3).
- The right VPC network Cloud Router hasn’t just programmed routes in the routing table of the region where it lives; it also programmed them in the routing table of ew4.
- While programming routes in other regions (specifically ew4), the Cloud Router in the right VPC network adds an extra inter-regional cost to its routes (yes, this is the second time it’s done for 192.168.128.0/24!). Observe the routing table of the right VPC network in ew4: compared with the same entries in ew3, each priority is 208 higher, so 192.168.0.0/24 goes from 100 to 308.
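To double check the numbers, the two steps (advertisement by the left Cloud Router, then programming by the right one) can be replayed in a short Python sketch. It follows my mental model, not GCP internals; the variable and function names are assumptions, and 208 is simply the penalty observed between ew3 and ew4 at the time of writing.

```python
BASE_PRIORITY = 100
XR_EW3_EW4 = 208  # penalty observed at the time of writing; it changes over time

ROUTER_REGION = "ew3"  # both Cloud Routers of the single VPN live in ew3

# Subnets of the left VPC network and the region each one lives in.
left_subnets = {"192.168.0.0/24": "ew3", "192.168.128.0/24": "ew4"}


def penalty(region_a, region_b):
    """Cross-regional penalty between two regions (0 within the same region)."""
    return 0 if region_a == region_b else XR_EW3_EW4


# Step 1: the left Cloud Router advertises its subnets, adding the penalty
# for subnets that live in a region other than its own.
advertised = {
    prefix: BASE_PRIORITY + penalty(region, ROUTER_REGION)
    for prefix, region in left_subnets.items()
}

# Step 2: the right Cloud Router programs what it learned in each regional
# routing table, adding the penalty again for regions other than its own.
right_tables = {
    region: {p: prio + penalty(ROUTER_REGION, region) for p, prio in advertised.items()}
    for region in ("ew3", "ew4")
}

for region, table in right_tables.items():
    print(region, table)
```

The ew3 table ends up with priorities 100 and 308, while the ew4 table ends up with 308 and 516: the 192.168.128.0/24 entry pays the penalty twice.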
What happens if we instruct the Cloud Router in the left VPC network to send (besides subnet routes) the custom advertisement 0.0.0.0/0?
As in the regional example above, the custom route is advertised to the right router. No surprises. But again, given dynamic routing mode is set to global, the custom route is also programmed in the right VPC network (ew4), with an additional cross-regional penalty of 208.
Notice that the 0.0.0.0/0 route has instead not been programmed in the left VPC network (ew4), as this is not a subnet route. We simply instructed the left Cloud Router to advertise that extra route as a custom advertisement towards the right VPC network.
Let’s make things even more interesting: let’s add the second VPN between the two VPC networks in the ew4 region.
Looking at the routing tables again, the new VPN created a new set of entries that may not be easy to understand upfront.
- the ew3 routing table has the new entry 192.168.128.0/24 with next-hop tun2 (ew4), with priority 308.
This route was originally learned from the Cloud Router in the left VPC network (ew4), which didn’t add any extra inter-regional cost, since the route comes from the same region where that Cloud Router lives. The Cloud Router in the right VPC network (ew4) programs the route in ew3 as well, adding the cross-regional penalty.
Symmetrically…
- the ew4 routing table has the new entry 192.168.0.0/24 with next-hop tun1 (ew3), with priority 308.
This route was originally learned from the left VPC network (ew3) Cloud Router, which didn’t add any extra inter-regional cost, since the route comes from the same region where that Cloud Router lives. The Cloud Router in the right VPC network (ew3) programs the route in ew4 as well, adding the cross-regional penalty.
Notice that
- Routes for the same prefix and with the same priority are all maintained in the routing table (read more on ECMP, below)
- If several routes for the same prefix are learned with different resulting priorities, only the one(s) with the lowest priority value (that is, the lowest cost) remain visible to you. The other routes appear again as soon as the one with the lowest priority value is withdrawn
Indeed, the following routes have not been programmed:
europe-west3
192.168.0.0/24, next-hop tun2 (ew4), priority 516
europe-west4
192.168.128.0/24, next-hop tun1 (ew3), priority 516
This is because the regional routing tables already have entries with a lower priority value (100) for the same destination.
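The selection logic can be sketched in Python as follows. This is my simplified reading of the behavior described above (lowest priority value wins, ties are all kept), not GCP’s actual implementation; the function name is an assumption and the candidate list reproduces the right VPC network, region ew3, with both tunnels up.

```python
from collections import defaultdict


def select_best(candidates):
    """Keep, per prefix, only the route(s) with the lowest priority value.

    candidates: list of (prefix, next_hop, priority) tuples.
    Ties are all kept, which is what makes ECMP possible.
    """
    by_prefix = defaultdict(list)
    for prefix, next_hop, priority in candidates:
        by_prefix[prefix].append((next_hop, priority))
    best = {}
    for prefix, routes in by_prefix.items():
        lowest = min(priority for _, priority in routes)
        best[prefix] = [(hop, prio) for hop, prio in routes if prio == lowest]
    return best


# Candidate routes for the right VPC network, region ew3, with both tunnels up.
ew3_candidates = [
    ("192.168.0.0/24", "tun1 (ew3)", 100),
    ("192.168.0.0/24", "tun2 (ew4)", 516),
    ("192.168.128.0/24", "tun1 (ew3)", 308),
    ("192.168.128.0/24", "tun2 (ew4)", 308),
]

ew3_table = select_best(ew3_candidates)
for prefix, routes in ew3_table.items():
    print(prefix, routes)
```

The 516 candidate is dropped because a route with priority 100 exists for the same prefix, while both 308 routes for 192.168.128.0/24 are kept.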
Where are my routes?
“Alright, I want to replicate what Luca just showed me!” You bring up the testbed, go to the GCP console, and you don’t see all the routes I showed you before. Oops, surprise! Not all those routes are visible in the Google Cloud Console (the GUI).
What you won’t see are the dynamic cross-regional routes, programmed by Cloud Routers in the destination VPC network. In the examples above, these are the ones we saw in the last diagram, when we added the second VPN tunnel in ew4.
To see those, you’ll need to use the gcloud CLI:
gcloud compute routers get-status ROUTER_NAME --region=REGION --project PROJECT_ID
The output will show you
- What routes the Cloud Router is advertising (section bgpPeerStatus/advertisedRoutes)
- What routes the Cloud Router learned from others (section bestRoutesForRouter)
- What routes are programmed in the regional routing table, including the ones you miss in the GCP console (section bestRoutes)
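If you want to compare these three sections quickly, you can add --format=json to the command above and post-process the output. The Python sketch below shows one possible way to do it. Treat it as an assumption-laden example: the embedded sample is hypothetical and heavily trimmed (values are made up), and I’m assuming each route object exposes destRange and priority fields under a top-level result key, so check it against your real output.

```python
import json

# Hypothetical, heavily trimmed sample of the JSON output (values are made up).
SAMPLE_STATUS = """
{
  "result": {
    "bestRoutes": [
      {"destRange": "192.168.0.0/24", "priority": 100},
      {"destRange": "192.168.128.0/24", "priority": 308}
    ],
    "bestRoutesForRouter": [
      {"destRange": "192.168.0.0/24", "priority": 100}
    ],
    "bgpPeerStatus": [
      {
        "name": "peer-1",
        "advertisedRoutes": [{"destRange": "10.0.0.0/24", "priority": 100}]
      }
    ]
  }
}
"""


def route_pairs(routes):
    """Reduce route objects to (destination, priority) tuples."""
    return [(r.get("destRange"), r.get("priority")) for r in routes]


def summarize(status_json):
    """Group routes by the three sections described above."""
    result = json.loads(status_json).get("result", {})
    advertised = []
    for peer in result.get("bgpPeerStatus", []):
        advertised.extend(route_pairs(peer.get("advertisedRoutes", [])))
    return {
        "advertised": advertised,
        "learned_by_router": route_pairs(result.get("bestRoutesForRouter", [])),
        "programmed_in_region": route_pairs(result.get("bestRoutes", [])),
    }


summary = summarize(SAMPLE_STATUS)
for section, routes in summary.items():
    print(section, routes)
```

To use it with real data, save the JSON output of the command to a file and pass its content to summarize() instead of the sample.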
Paths enforcement
In the last example, we saw how traffic can follow multiple paths to reach the same destination, given how routes are set up.
For example, if an instance in the right VPC network (ew4) wants to reach another instance in the left VPC network (ew3), what path will it take?
Taking a look at the right VPC network (ew4) routing table we can see there are two entries for 192.168.0.0/24:
- next-hop tun1 (ew3) / priority 308
- next-hop tun2 (ew4) / priority 308
This means that, while both tunnels are up, Equal Cost Multi Path (ECMP) is used to send traffic to the destination. If one of the two tunnels goes down, all the traffic passes through the one that is still up.
What if we want to force traffic to go through a specific path, even if both tunnels are up? We’ll need to act on the base cost, so that we change the final route priority (in the “destination” regional routing table).
Let’s say we want traffic to go through tun1 (ew3) all the time, even when the VPN in ew4 is up. From the right VPC network point of view, this means we always go cross-region first and then traverse the VPN via tun1 (ew3).
To do so, we need to advertise routes from the CR in the left VPC network (ew4) with a higher base priority value. How much higher?
Adding 1 point to the base cost is enough for the route with next-hop tun1 (ew3) to be preferred over the one going through tun2 (ew4).
Given the dynamic nature of cross-regional penalties (varying from 201 to 9999), setting a delta higher than just 1 is definitely a better approach. For example, a delta of 10000 guarantees that the route whose MED value you are changing is always less preferred than any other route automatically propagated by GCP for the same prefix.
Our route would have a cost of
base cost + delta = 100 + 10000 = 10100
The GCP generated route (with the max cross-regional penalty possible) would have a cost of
base cost + cross-region penalty = 100 + 9999 = 10099
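The same comparison can be checked for every possible penalty value with a few lines of Python. This is only the arithmetic above turned into code; the variable names are mine.

```python
BASE_COST = 100
XR_PENALTY_MIN, XR_PENALTY_MAX = 201, 9999
DELTA = 10000

# Route advertised with the raised base priority.
our_route = BASE_COST + DELTA

# Compare against the GCP generated route for every possible penalty value.
always_less_preferred = all(
    our_route > BASE_COST + penalty
    for penalty in range(XR_PENALTY_MIN, XR_PENALTY_MAX + 1)
)

print(our_route, always_less_preferred)  # 10100 True
```

Since 10100 is higher than 10099, the worst case, the route with the raised base cost loses against any cross-regional route for the same prefix.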
Conclusions
As people say, the devil is always in the details. As you can see, what may seem simple at a high level hides quite a lot of technical details, which should be carefully evaluated in order to properly design complex network infrastructure.
Hopefully, you enjoyed the read and this intro to GCP routing has been useful for you, as it will be for me every time I start questioning (again) if I really know my GCP networking stuff.
As always, keep an eye on the official documentation for the latest updates, and -most of all- have fun with GCP!