Next-Generation Networking with AWS Transit Gateway and Shared VPCs

Published in Slalom Technology, Jan 7, 2019

AWS’s annual conference, re:Invent, sets the stage for product releases and announcements that continually push the boundaries of what’s possible on their platform. This year’s event was no exception, with a staggering 147 product releases in a number of trending technology areas such as Machine Learning, Blockchain, IoT, and Serverless Computing. However, it was announcements in the networking space with the release of AWS Transit Gateway and the introduction of Shared VPCs that particularly caught my attention. AWS Transit Gateway is a truly game-changing technology solution as it provides a central hub for connecting AWS VPCs and VPN connections, while Shared VPC eases the pain of managing multiple VPCs by allowing you to centrally manage and distribute them to accounts in your organization.

Considering the array of cutting-edge services at your fingertips within the AWS platform, it’s fairly easy to understand why a core topic such as network architecture often draws less of a crowd.

But answer me this: Would you build a new house without first knowing that the foundations were solid?

While I am amazed by AWS and their continued product releases in areas such as IoT, real-time data streaming and now even Satellite communication, I remain passionate about fundamental content such as multi-account strategies, governance at scale, and network architectures in the cloud. As a solutions architect, these are the foundations that pave the way for our clients to be successful in their respective journeys to the cloud with AWS.

In the few weeks since their release, I’ve been able to gain some hands-on experience with both Transit Gateway and Shared VPCs. I’ll start by discussing the historical challenges of networking at scale in AWS, before outlining each service and discussing how I believe their combination will alter the design of foundational network architectures in AWS going forward.

The Challenges of Networking at Scale in AWS

As organizations have moved to AWS in recent years, Virtual Private Clouds (VPCs) have been adopted as a way of separating workloads from one another when necessary. As a refresher, a VPC lets you provision a logically isolated section of the AWS cloud where you can launch resources in a virtual network that you define. When resources in two separate VPCs need to communicate, a VPC peering connection is created between them. In addition to communication between VPCs, most organizations also require hybrid connectivity, using VPNs or AWS Direct Connect to connect VPCs to On-Premise networks. At a small scale, involving only a few VPCs, connectivity is “manageable” at best. Consider the following example, with 5 VPCs that we would like to connect to each other via peering and to On-Premise via VPN:

**Image 1:** Counting the number of connections required to connect VPCs to each other and On-Premise

As you can see, 5 interconnected VPCs which are also connected to an On-Premise location require 10 distinct VPC peering connections and 5 VPN connections. For true redundancy, there would also be an additional 5 VPN connections to a second Customer Gateway (CGW) On-Premise.

**Image 2:** Growth of peering connections required to connect VPCs in a full-mesh network

Image 2 shows how the number of peering connections required to connect VPCs in a full-mesh network grows quadratically as the number of VPCs increases: n VPCs need n(n-1)/2 connections, so 5 VPCs need 10 and 100 VPCs need 4,950. The scenario illustrated in Image 1 is slightly far-fetched, as it’s unlikely that every pair of VPCs will require a peering connection in a real-world solution. However, it helps to show how quickly the number of connections can grow given VPC peering requirements.
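The counts in Image 1 and Image 2 come down to simple arithmetic: a full mesh needs one peering connection per pair of VPCs, and each VPC needs one VPN connection per Customer Gateway. A minimal sketch of that calculation follows; the function names are my own and purely illustrative.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so that every pair of VPCs is directly connected."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    # One connection per unordered pair: n choose 2.
    return vpc_count * (vpc_count - 1) // 2


def vpn_connections(vpc_count: int, customer_gateways: int = 1) -> int:
    """VPN connections needed when every VPC connects to every Customer Gateway."""
    if vpc_count < 0 or customer_gateways < 0:
        raise ValueError("counts must be non-negative")
    return vpc_count * customer_gateways


if __name__ == "__main__":
    # Print the growth of the full mesh for a few VPC counts.
    for n in (5, 10, 50, 100):
        print(f"{n} VPCs: {full_mesh_peerings(n)} peerings, "
              f"{vpn_connections(n, 2)} VPN connections to two CGWs")
```

For the 5 VPCs in Image 1 this gives 10 peering connections, plus 5 VPN connections for one Customer Gateway or 10 for two.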

In addition to VPC peering connection requirements, as the number of VPCs owned by an organization has grown from tens to hundreds to thousands, the creation and management of connections from VPCs to On-Premise infrastructure has proven to be a major challenge. To tackle this, AWS introduced a Transit VPC solution in mid-2016, as shown in Image 3.

**Image 3:** A sample Transit VPC solution and associated connectivity between VPCs

A Transit VPC solution uses host-based VPN appliances in a dedicated VPC to perform transitive routing between networks through a central hub. Introducing an additional VPC only requires new VPN connections between the host appliances and that VPC, rather than additional connections to the On-Premise CGW. Although this reduces the number of On-Premise connections, the dedicated host appliances add cost and management overhead. Furthermore, this solution doesn’t help to reduce the number of VPC peering connections required.

AWS Transit Gateway

Considering the challenges discussed in the prior section, the release of AWS Transit Gateway at re:Invent 2018 was an exciting development. Utilizing Transit Gateway, you only need to create and manage a single connection, called a Transit Gateway Attachment, between the Gateway and each Amazon VPC or On-Premise location. Transit Gateway maintains its own route table, separate from the route tables associated with subnets within individual VPCs.

**Image 4:** High-Level Overview of Transit Gateway, Transit Gateway Attachments, and Transit Gateway Route Table

**Image 5:** Example Subnet Route Table for a subnet in VPC 1

AWS Transit Gateway removes the need to configure peering connections between VPCs that need to communicate. Instead, each individual VPC is associated with the Transit Gateway using a Transit Gateway Attachment, as shown in Image 4. The Transit Gateway Route Table (also shown in Image 4) contains a complete list of all VPCs and VPNs associated with the Transit Gateway and their respective Transit Gateway Attachments. Within the route tables associated with a particular VPC subnet (example shown in Image 5), traffic destined for another VPC’s CIDR range is simply directed towards the source VPC’s Transit Gateway Attachment. Once traffic reaches the Transit Gateway via that attachment, the Transit Gateway Route Table is used to determine which attachment to use to send the traffic to its final destination.
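The two-stage lookup described above can be modelled in a few lines. This is a simplified sketch rather than a description of how AWS implements routing: the CIDR ranges, attachment names, and the `trace` helper are all hypothetical, and only the most specific matching route is considered.

```python
import ipaddress

# Hypothetical subnet route table for a subnet in VPC 1 (compare Image 5).
VPC1_SUBNET_ROUTES = {
    "10.1.0.0/16": "local",
    "10.2.0.0/16": "tgw-attach-vpc1",
    "10.3.0.0/16": "tgw-attach-vpc1",
}

# Hypothetical Transit Gateway Route Table (compare Image 4).
TGW_ROUTES = {
    "10.1.0.0/16": "tgw-attach-vpc1",
    "10.2.0.0/16": "tgw-attach-vpc2",
    "10.3.0.0/16": "tgw-attach-vpc3",
    "192.168.0.0/16": "tgw-attach-vpn",
}


def longest_prefix_match(route_table, destination):
    """Return the target of the most specific route covering destination, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, target)
    return best[1] if best else None


def trace(destination):
    """List the hops a packet from the VPC 1 subnet takes towards destination."""
    first_hop = longest_prefix_match(VPC1_SUBNET_ROUTES, destination)
    if first_hop is None:
        return ["dropped: no subnet route"]
    if first_hop == "local":
        return ["local"]
    # Traffic has reached the Transit Gateway; consult its own route table.
    egress = longest_prefix_match(TGW_ROUTES, destination)
    if egress is None:
        return [first_hop, "dropped: no Transit Gateway route"]
    return [first_hop, egress]
```

In this model, `trace("10.2.4.7")` leaves through VPC 1's attachment and exits through VPC 2's, while `trace("192.168.1.1")` is dropped at the subnet because the subnet route table has no route for the VPN range, even though the Transit Gateway does.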

In addition to making it easier to interconnect VPCs, AWS Transit Gateway removes the cross availability-zone data charges that exist when utilizing VPC peering connections. Instead, AWS Transit Gateway charges a flat fee per Transit Gateway attachment and then per GB of data that flows through the Gateway, regardless of source and destination. Information on pricing can be found on the AWS Transit Gateway pricing page.

Serving as a central hub for all network connectivity, Transit Gateway also replaces the requirement for a Transit VPC solution (shown previously in Image 3) in most circumstances. Instead, VPN connections are associated with the Transit Gateway via a Transit Gateway Attachment in the same way that VPCs are. With this, traffic from On-Premise networks can be directed to any other network attached to the Transit Gateway as long as route table entries allow it to do so. Although VPN traffic is limited to 1.25 Gbps of bandwidth per VPN tunnel, Transit Gateway includes Equal-Cost Multi-Path (ECMP) routing support. Assuming the other end of the VPN connection supports ECMP, traffic can be distributed equally across multiple VPN connections to scale the effective bandwidth.
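To make the ECMP scaling concrete, here is a rough sketch of the arithmetic and of the general idea of spreading flows across equal-cost tunnels. The hashing scheme is my own illustration of how ECMP is commonly done, not AWS's actual algorithm, and the function names are hypothetical.

```python
import hashlib

TUNNEL_GBPS = 1.25  # per-tunnel VPN limit quoted above


def aggregate_bandwidth_gbps(tunnel_count: int) -> float:
    """Upper bound on combined bandwidth when traffic spreads evenly over tunnels."""
    if tunnel_count < 0:
        raise ValueError("tunnel_count must be non-negative")
    return tunnel_count * TUNNEL_GBPS


def pick_tunnel(flow: tuple, tunnel_count: int) -> int:
    """Map a flow (src_ip, dst_ip, protocol, src_port, dst_port) to a tunnel index.

    Illustrative only: a stable hash keeps every packet of one flow on the
    same tunnel, which is why the aggregate figure is only approached when
    there are many flows to spread around.
    """
    if tunnel_count < 1:
        raise ValueError("at least one tunnel is required")
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % tunnel_count
```

Four equal-cost tunnels would give an upper bound of 5 Gbps in this model. Because a flow is pinned to a single tunnel in the sketch, one large transfer would still be limited to a single tunnel's bandwidth.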

Although Transit Gateway greatly simplifies the management of connections from AWS to On-Premise networks, organizations may prefer to utilize a Transit VPC solution if they require additional monitoring and visibility or further security features, such as:

Outbound URL filtering
Firewall devices
Intrusion Detection and Prevention (IDP)
Unified Threat Management (UTM)

In such cases, Transit Gateway and Transit VPC solutions can be used together to achieve these goals.

AWS Transit Gateway is a welcome release that solves a number of networking related challenges. However, additional functionality is expected in 2019 that will further increase its value for organizations of all sizes. At present, hybrid connectivity between AWS Transit Gateway and On-Premise networks can only be established via VPN connections. Although Transit Gateway supports VPN connections with ECMP enabled, support for AWS Direct Connect is slated for release in Q1 of 2019. This will allow for multi-Gbps connections in and out of the Transit Gateway via a single connection. In addition to added support for AWS Direct Connect, AWS have indicated that they will soon make it possible to connect Transit Gateways in separate AWS Regions via the release of Transit Gateway Peering. This will allow organizations to build globally distributed networks with minimal effort.

Shared VPCs

The release of Virtual Private Cloud (VPC) functionality back in 2009 was a first step towards the logical isolation of workloads in AWS. Many organizations have migrated workloads to AWS in the years since, often creating hundreds to thousands of VPCs to support a wide variety of use cases. As discussed when outlining the challenges of networking at scale in AWS, the management of this many VPCs can take significant time and effort.

AWS pride themselves on being a customer-focused organization, with around 95% of all product releases coming as a result of user feedback. In this case, persistent requests from customers challenged AWS to make the management of VPCs at scale easier. Unveiled at re:Invent 2018, their solution was to introduce Shared VPCs. Simply put, VPC sharing allows many AWS accounts to create their resources within a centrally managed VPC. The AWS account that creates and owns the VPC can choose to share particular subnets with other AWS accounts within the same AWS Organization. Once a subnet is shared with an account, that account can create, view and modify the resources it owns within that subnet.

**Image 6:** Sharing a VPC with multiple AWS Accounts using Resource Shares

Looking at Image 6, imagine you want to create a VPC that you can share with all of the development-level accounts for a particular business unit. If you allocate the largest VPC CIDR block possible, a /16, this provides you with 65,536 IP addresses that can be grouped and shared using subnet sizes of your choice. For example, you could create 256 subnets of size /24 (each with 256 IP addresses) or 1024 subnets of size /26 (each with 64 IP addresses). Once you have defined your subnets, you can then share any of them with any account within your AWS Organization using a Resource Share. To implement a granular level of segmentation, subnet NACLs can be used to fence off access between specific subnets, ports or destinations.
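The subnet arithmetic above is easy to check with Python's standard `ipaddress` module. The `10.0.0.0/16` range and the `carve` helper are assumptions chosen for illustration; the article does not specify which CIDR block was used.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves five addresses in every subnet


def carve(vpc_cidr: str, new_prefix: int):
    """Split a VPC CIDR block into equally sized subnets of the given prefix length."""
    return list(ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix))


if __name__ == "__main__":
    vpc_cidr = "10.0.0.0/16"  # hypothetical VPC range
    print("VPC addresses:", ipaddress.ip_network(vpc_cidr).num_addresses)
    for prefix in (24, 26):
        subnets = carve(vpc_cidr, prefix)
        size = subnets[0].num_addresses
        print(f"/{prefix}: {len(subnets)} subnets of {size} addresses "
              f"({size - AWS_RESERVED_PER_SUBNET} usable each)")
```

Since AWS reserves five addresses in each subnet, the usable count per subnet is slightly lower than the raw size, which matters more as subnets get smaller.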

While it’s true that utilizing Shared VPCs can reduce the number of VPCs and overall management burden, it’s important to realize that it’s not a one-size-fits-all solution. Significant thought needs to be put into how many VPCs to utilize and how to share their subnets between accounts.

For example, some questions to ask include:

At what level will VPCs be defined? — Per line of business? Per team? Per project?
Which VPCs should route traffic to one another? Should certain subnets have dedicated route tables?
Which VPC subnets require strict traffic flow management using subnet NACLs?

Working together as an organization to answer questions such as these and define all aspects of network management is extremely important.

Network Architectures Combining AWS Transit Gateway and Shared VPCs

Having provided an overview of AWS Transit Gateway and Shared VPCs, it should be evident that each of them tackles a different challenge in the management of networking in the cloud. AWS Transit Gateway eases the burden of managing connectivity both between VPCs and from VPCs to On-Premise networks, whereas Shared VPC helps to reduce the number of VPCs by offering a method to centrally manage and distribute them.

With these capabilities at hand, using them in combination will help define the next generation of cloud network architectures in AWS. Consider Image 7, which is an architecture I built in AWS with very little effort. Within my AWS Organization, I have a master billing account, a logging account, an account for hosting networking infrastructure, a shared services account, two development-level application accounts, and two production-level application accounts. In lieu of an On-Premise datacenter, I utilized Google Cloud Platform (GCP) to test hybrid connectivity over VPN. Steps outlining the setup of VPN between AWS Transit Gateway and GCP are beyond the scope of this article, but I found the GCP documentation to be particularly useful.

**Image 7:** An example AWS Network Architecture Utilizing Transit Gateway and Shared VPCs

First, let’s examine the use of Shared VPCs within this setup. From Image 7, notice that there are 3 VPCs defined: Dev, Prod, and Shared Services. These VPCs are defined within an AWS account dedicated to centralized management of network infrastructure. The Dev and Prod VPCs are shared with environment-specific application accounts, while the Shared Services VPC is shared with a single account. As discussed earlier, the sharing of VPC subnets is done via Resource Shares in the AWS Resource Access Manager. Image 8 shows one of the Resource Shares I created to share subnets 1, 2, 5 & 6 from the Dev VPC with Dev Account 1. The “Shared resources” section outlines which subnets have been shared, while the “Shared principals” section defines which accounts they have been shared with.

**Image 8:** Resource Share to share subnets from Dev VPC to Dev Account 1

Each principal account can only see VPC subnets that have been shared with it. In addition, even if a single subnet is shared with multiple accounts, each account can only see the resources that they own within it.
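These visibility rules can be captured in a small model. It is only a sketch of the behaviour described above, not of the Resource Access Manager API: the share for Dev Account 1 mirrors Image 8, while the share for Dev Account 2 and all resource names are hypothetical.

```python
# Subnets of the Dev VPC shared with each participant account.
SHARES = {
    "dev-account-1": {"subnet-1", "subnet-2", "subnet-5", "subnet-6"},  # as in Image 8
    "dev-account-2": {"subnet-3", "subnet-4", "subnet-5"},  # hypothetical share
}

# Hypothetical resources launched into the shared subnets.
RESOURCES = [
    {"name": "api-server", "subnet": "subnet-5", "owner": "dev-account-1"},
    {"name": "batch-worker", "subnet": "subnet-5", "owner": "dev-account-2"},
    {"name": "cache", "subnet": "subnet-1", "owner": "dev-account-1"},
]


def visible_subnets(account):
    """An account only sees the subnets that have been shared with it."""
    return sorted(SHARES.get(account, set()))


def visible_resources(account, subnet):
    """Within a shared subnet, an account only sees the resources it owns."""
    if subnet not in SHARES.get(account, set()):
        return []
    return [r["name"] for r in RESOURCES
            if r["subnet"] == subnet and r["owner"] == account]
```

In this model, subnet 5 is shared with both accounts, yet `visible_resources("dev-account-1", "subnet-5")` returns only `api-server` and the same call for Dev Account 2 returns only `batch-worker`.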

Looking again at Image 7, AWS Transit Gateway is the central point of all connectivity within the architecture. Each of the VPCs and the VPN from GCP is connected to the Transit Gateway via a Transit Gateway Attachment. In addition, each also has an entry in the Transit Gateway Route Table listing the appropriate Gateway Attachment to use to send traffic to it. Subnet route tables can be used to dictate connectivity from that subnet to other networks. For example, Image 9 and Image 10 show the route tables for private subnets from the Dev and Shared Services VPCs respectively.

**Image 9:** Dev VPC Private Subnet Route Table

**Image 10:** Shared Services VPC Subnet Route Table

The Shared Services VPC has routes to connect to both the Development and Production VPCs, but not the network in GCP (see Image 10), while the Development VPC has routes to the Shared Services VPC and the network in GCP, but not the Production VPC (see Image 9). Setting the source VPC’s Transit Gateway Attachment as the next hop destination is all that is required to ensure that traffic is directed first to the Gateway and then on to the final network destination.
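Because routes are configured per network, two networks can only hold a conversation when each has a route to the other. The sketch below encodes the routes from Image 9 and Image 10 and checks every pair. The entries for the Prod VPC and the GCP network are my assumptions, since their route tables are not shown, and the model ignores the Transit Gateway Route Table, which the article states has an entry for every attachment.

```python
from itertools import combinations

# Destinations each network routes to via the Transit Gateway.
ROUTES = {
    "dev": {"shared-services", "gcp"},      # Image 9
    "shared-services": {"dev", "prod"},     # Image 10
    "prod": {"shared-services"},            # assumption: not shown in the article
    "gcp": {"dev"},                         # assumption: not shown in the article
}


def can_communicate(a, b):
    """Two-way traffic needs a route from a to b and a return route from b to a."""
    return b in ROUTES.get(a, set()) and a in ROUTES.get(b, set())


def connectivity_matrix():
    """Map each unordered pair of networks to whether two-way traffic is possible."""
    return {(a, b): can_communicate(a, b)
            for a, b in combinations(sorted(ROUTES), 2)}


if __name__ == "__main__":
    for (a, b), ok in connectivity_matrix().items():
        print(f"{a} <-> {b}: {'yes' if ok else 'no'}")
```

Under these assumptions, Dev can reach Shared Services and GCP, Prod can reach only Shared Services, and neither Dev and Prod nor Shared Services and GCP can talk to each other.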

Looking Forward

By introducing AWS Transit Gateway and Shared VPCs, AWS has again listened to customer feedback to tackle some of the major problems in network management and connectivity at scale. As demonstrated, utilizing Transit Gateway and Shared VPCs within a network architecture makes it straightforward to build a manageable, highly connected cloud network. Looking forward, additional releases in the form of Direct Connect support and global peering for Transit Gateway will greatly increase its value for organizations of all sizes. Before long, I have no doubt that both of these releases will play a pivotal role in the next generation of globally scalable cloud network architectures in AWS.