Scale Securely with Transit Gateway
AWS Transit Gateway is an under-appreciated service providing secure, scalable, and maintainable connectivity between distributed services and on-premise networks.
--
Supporting Scale
As software company offerings expand, so do their teams. When startups get acquired or go public, the engineering team grows into a full engineering organization. That engineering organization then needs to scale to support more customers, and in the process monolithic architectures give way to service-oriented designs.
Like the code they write, engineering organizations migrate from a single entity into a distributed collection of two-pizza teams. As with their micro-services, these teams need to talk to one another.
Engineers have lots of ways to communicate: email, Slack, JIRA, and passive-aggressive post-it notes. Likewise, there are many options for connectivity in AWS, including Internet Gateways, NAT Gateways, Direct Connect Gateways, Egress-Only Internet Gateways, VPC Peering, and PrivateLink.
Connectivity Options
Each option has a set of trade-offs associated with it, but the main ones used for service-to-service communication are Internet Gateways, VPC Peering, PrivateLink, and Transit Gateway. Here is a vastly oversimplified use case for each of these resources:
- Internet Gateways: use only for public services. Traffic is routed over the internet and DNS entries are configured in a public Route53 Hosted Zone, making your service discoverable and accessible to the world.
- PrivateLink: use for secure, single service-to-service integration with one of the supported services. Not ideal for shared services as each consumer network requires a VPC Endpoint.
- VPC Peering: use for single, VPC-to-VPC service sharing. Not ideal for mesh networks because peering connections are not transitive, and gateways and VPN connections cannot be shared across a peering connection (no edge-to-edge routing).
- Transit Gateway: use for general purpose, secure, centrally-managed shared services and internal, on-premise applications. Combine with private Hosted Zones and a Route53 Resolver to support private DNS.
Transit Gateway vs. VPC Peering — Limitations
As a network-level utility that enables the sharing of multiple resources, VPC Peering is the closest analog to Transit Gateway. The two also share similar limitations, including the inability to connect VPCs that have overlapping IP address ranges (CIDR blocks).
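Because overlapping CIDR blocks rule out both options, it can be worth checking address ranges before planning either one. The following is a minimal sketch using Python's standard `ipaddress` module; the VPC names and CIDR blocks are hypothetical and chosen only for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(vpc_cidrs):
    """Return the pairs of VPC names whose CIDR blocks overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()
    }
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical VPC names and address ranges, for illustration only.
vpcs = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.1.0.0/16",
    "vpc-c": "10.0.128.0/17",  # sits inside vpc-a's range
}

print(find_overlaps(vpcs))  # [('vpc-a', 'vpc-c')]
```

Any pair this reports would need to be re-addressed before the two VPCs could be connected through either VPC Peering or Transit Gateway.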
However, there are several significant unsupported VPC Peering configurations that are fully supported by Transit Gateway.
VPC Peering does not support transitive peering
This restriction is so important, it is actually the reason behind the name Transit Gateway. Transitive networks greatly simplify full, multi-VPC mesh networks where every node is connected to every other node in the network.
If you have a VPC Peering connection between VPC A and VPC B, and one between VPC A and VPC C, there is no VPC Peering connection between VPC B and VPC C. You cannot route packets directly from VPC B to VPC C through VPC A.
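The A, B, C example can be expressed as a toy model. This sketch is not an AWS API; it only encodes the rule that traffic may cross a single peering connection and is never forwarded through an intermediate VPC. The function name and data layout are assumptions made for illustration.

```python
def can_route(peerings, src, dst):
    """True only if src and dst are directly peered.

    Peering is not transitive, so no multi-hop path through a
    third VPC is ever considered.
    """
    return frozenset((src, dst)) in peerings


# The two peering connections from the example: A-B and A-C.
peerings = {frozenset(("A", "B")), frozenset(("A", "C"))}

print(can_route(peerings, "A", "B"))  # True
print(can_route(peerings, "A", "C"))  # True
print(can_route(peerings, "B", "C"))  # False: A cannot forward for B
```

To let B reach C with peering alone, a third connection between B and C has to be created, which is exactly what makes full meshes expensive.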
While it is possible to connect up to 125 VPCs through peering connections, the issue is one of scale. For a full mesh network, there need to be n(n-1)/2 total connections. For 125 VPCs, that is 7,750 peering connections! On the other hand, Transit Gateway supports up to 5,000 VPCs and only requires one Attachment per VPC.
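To make the growth concrete, here is a short sketch comparing the number of peering connections in a full mesh with the number of Transit Gateway attachments for the same number of VPCs. The function names are illustrative only.

```python
def full_mesh_peerings(n):
    """Peering connections needed to connect every VPC to every other: n(n-1)/2."""
    return n * (n - 1) // 2


def transit_gateway_attachments(n):
    """One attachment per VPC, regardless of how many VPCs there are."""
    return n


for n in (5, 25, 125):
    print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
# 5 10 5
# 25 300 25
# 125 7750 125
```

The peering count grows quadratically while the attachment count grows linearly, so the gap widens quickly as VPCs are added.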
VPC Peering does not allow sharing of Edge-to-Edge Gateways
Additionally, VPC Peering does not support the sharing of Edge-to-Edge Gateways or Private Connections including Internet Gateways, Direct Connect Gateways, or VPC Endpoints. That means each of these resources needs to be recreated across every network that needs access.
With Transit Gateway, only a single Site-to-Site VPN or Direct Connect connection is needed to host internal services that span multiple accounts. This configuration is much easier to monitor and scale.
With VPC Peering, by contrast, edge-to-edge routing is not supported; you cannot use VPC A to extend the peering relationship to exist between VPC B and the corporate network.
Internal & Shared Services
Although Transit Gateway enables many advanced network topologies, at a high level it is especially useful for provisioning internal and shared services.
Internal Services
Cloud-hosted internal services are valuable because they combine the scalability and cost-effectiveness of cloud infrastructure with the privacy, security, and control of on-premise systems.
With the proliferation of Software as a Service (SaaS), we have become accustomed to the notion of distributing information to third-party providers and creating services that are connected to the public internet. However, restricting access to such services at the network level improves security by reducing the surface area for a potential attack, and it eliminates unnecessary traffic from bots.
Before Transit Gateway, in order to launch internal, cloud-hosted services each isolated VPC needed its own connection to a corporate network. This is difficult to monitor and maintain and is not cost-effective since many of these resources are priced by the hour. With Transit Gateway, it is much easier to monitor and restrict traffic across just a single connection.
Shared Services
Monolithic architecture is by its nature centralized. This means that communication between components and sharing dependencies is trivial. Yet, monoliths are an example of tightly-coupled software that has well-known issues related to scalability, security, and stability.
In direct contrast are distributed systems and service-oriented architectures (SOA). These inherently decoupled applications come with new issues of their own, such as the difficulty of sharing cross-cutting concerns like logging, alerting, and analytics. That is where Transit Gateway can help.
Rather than share a Site-to-Site VPN that connects to a corporate network, it is instead possible to share routes to a set of VPCs that together make up a suite of shared services. This ensures traffic to these services is not routed over the internet and both ingress and egress can be monitored in a single location. It also means that as new services get created, they can become instantly and automatically available to other applications without changes in configuration or the need to create a new peering connection.
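The idea that a newly attached service becomes reachable without touching its consumers can be illustrated with a toy route table. This is a simplified model, not the AWS API: it assumes every attachment propagates its CIDR block into one shared route table and that lookups use longest-prefix matching. The class name, attachment names, and CIDR blocks are all hypothetical.

```python
import ipaddress


class TransitGatewayRouteTable:
    """Toy model of a single shared route table with propagated routes."""

    def __init__(self):
        self.routes = {}  # ip_network -> attachment name

    def attach(self, attachment, cidr):
        """Attaching a VPC propagates its CIDR into the shared table."""
        self.routes[ipaddress.ip_network(cidr)] = attachment

    def lookup(self, ip):
        """Return the attachment for the most specific matching route, or None."""
        addr = ipaddress.ip_address(ip)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda net: net.prefixlen)]


table = TransitGatewayRouteTable()
table.attach("logging-vpc", "10.10.0.0/16")

print(table.lookup("10.10.3.4"))  # logging-vpc
print(table.lookup("10.20.0.5"))  # None: nothing attached for this range yet

# A new shared service is attached; existing consumers change nothing.
table.attach("analytics-vpc", "10.20.0.0/16")
print(table.lookup("10.20.0.5"))  # analytics-vpc
```

In this model the only step needed to publish a new service is the attachment itself, which mirrors the single-attachment-per-VPC design described earlier.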
Complexity in the cloud only grows, and that is especially true of cloud networking. As recently as 2006, Amazon Web Services (AWS) had only three service offerings: S3, SQS, and EC2. Today there are hundreds of services. Some, like Transit Gateway, come with an initial investment in time, but once set up, they greatly reduce the cost and difficulty of connectivity moving forward.