How we built a managed full mesh and transit VPC VPN for AWS
The story starts with the inter-region connect (IRC). The IRC is a secure, dedicated and highly available connection between Amazon Web Services (AWS) regions/VPCs with reliable performance, reserved bandwidth and baked-in failover. While AWS has an in-house solution for peering VPCs inside the same region, it does not have one for VPC peering across regions.
One of the features of the IRC was security: the ability to deploy IPsec-based VPN tunnels on top of the reserved bandwidth. These VPN connections were point-to-point tunnels between AWS regions/VPCs.
Datapath.io’s philosophy, however, has always been to listen to our customers. We repeatedly ran into scenarios where our customers had more intricate connectivity needs for their VPN setups on AWS. These included regular hub and spoke (transit VPC) and full mesh architectures, as well as the lesser-known full mesh plus leaf VPC architecture.
So, we decided to upgrade the IRC into a full-blown managed VPN solution that integrates both hub and spoke and full mesh VPN architectures into VPC connections across regions.
VPN connectivity architectures
Setting up VPN connections on AWS gets progressively harder as the architecture scales and inter-connectivity needs become more complex. Managing and troubleshooting these inter-region connections also becomes more complicated, time-consuming and costly.
Inter-region VPN connections on AWS are usually arranged in traditional point-to-point, transit VPC (hub and spoke) or full mesh architectures. Another, lesser-known VPN architecture integrates leaf VPCs, connected to an edge VPC, into the overall VPN setup.
Let’s take a moment to talk about the three inter-connectivity scenarios.
Transit VPC — Hub and spoke:
A hub and spoke architecture consists of a central inter-connection point, or hub, which in turn connects to all the spoke nodes. All communication between the edge nodes goes through the central hub. One major benefit of a hub and spoke architecture is that it minimizes the number of connections required to connect multiple nodes, making the network simpler to manage. It does, however, result in a higher number of hops between edge VPCs.
On AWS, this architecture is usually referred to as transit VPC. The transit VPC allows AWS users to connect VPCs distributed across different regions and corporate networks in a single global network with a central VPC acting as the hub.
AWS’s in-house transit VPC solution is based on two Cisco CSR 1000v instances deployed into a central transit VPC.
AWS transit VPC
Full mesh:
A full mesh architecture allows direct connections between all nodes of a network. On AWS this translates into direct connections between VPCs or regions without having to go through a central transit or hub VPC. Every VPC is connected to every other VPC. Setting up a full mesh VPN network on AWS can be quite complicated and requires much more involved management and monitoring.
Full Mesh VPN connectivity architecture on AWS
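To make the trade-off between the two architectures concrete, here is a minimal Python sketch that counts the tunnels each one needs for a given number of VPCs. The function names are ours, chosen for illustration only; the counts follow directly from the definitions above (one tunnel per spoke versus one tunnel per pair of VPCs).

```python
def hub_and_spoke_tunnels(vpc_count: int) -> int:
    """Tunnels needed when one of the VPCs acts as the hub: one per spoke."""
    return max(vpc_count - 1, 0)


def full_mesh_tunnels(vpc_count: int) -> int:
    """Tunnels needed when every VPC connects directly to every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(f"{n} VPCs: hub and spoke={hub_and_spoke_tunnels(n)}, "
              f"full mesh={full_mesh_tunnels(n)}")
```

With 10 VPCs, hub and spoke needs 9 tunnels while full mesh needs 45, which is why a full mesh becomes harder to manage by hand as the network grows.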
Hub and spoke plus full mesh:
This VPN connectivity scenario brings together both architectures. Edge nodes or VPCs are arranged in a full mesh architecture. Leaf VPCs, connected to the edge VPC, are then connected through a hub and spoke VPN architecture.
Full Mesh plus transit VPC VPN connectivity on AWS
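The combined layout can be modelled as a small graph. The sketch below is only an illustration under our own assumptions (the region and leaf names are made up, and a "hop" is simply one link in the model): edge VPCs are fully meshed, each leaf attaches to exactly one edge VPC, and a breadth-first search counts the hops between any two VPCs.

```python
from collections import deque
from itertools import combinations


def build_topology(edge_to_leaves):
    """Return an adjacency map: edge VPCs fully meshed, leaves attached to their edge VPC."""
    graph = {edge: set() for edge in edge_to_leaves}
    # Full mesh between all edge VPCs.
    for a, b in combinations(edge_to_leaves, 2):
        graph[a].add(b)
        graph[b].add(a)
    # Hub and spoke between each edge VPC and its leaf VPCs.
    for edge, leaves in edge_to_leaves.items():
        for leaf in leaves:
            graph.setdefault(leaf, set()).add(edge)
            graph[edge].add(leaf)
    return graph


def hops(graph, src, dst):
    """Breadth-first search for the number of links between two VPCs."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    raise ValueError(f"{dst} is not reachable from {src}")


# Hypothetical example: three edge VPCs, two of which have leaf VPCs.
topology = build_topology({
    "us-east-1": ["leaf-a", "leaf-b"],
    "eu-west-1": ["leaf-c"],
    "ap-south-1": [],
})
print(hops(topology, "us-east-1", "eu-west-1"))  # edge to edge: 1
print(hops(topology, "leaf-a", "leaf-c"))        # leaf to leaf across regions: 3
```

In this model, traffic between edge VPCs always takes a single hop, and leaf VPCs add at most one extra hop on each side.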
Datapath.io’s managed VPN
Datapath.io provides a super easy way of creating encrypted IPsec VPN tunnels between VPCs across AWS regions for all three inter-connectivity scenarios. It is fully managed, completely automated and highly available with no new hardware or software requirements.
The managed VPN solution leverages a CloudFormation stack to spin up VPN appliances in the user’s VPC. These appliances are AWS instances that run the proprietary VPN software and handle the IPsec encryption. They auto-discover the VPN instances in all the other regions and generate configuration settings for routing protocols, VPNs, firewall, IPsec and signalling connections.
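The discovery and configuration step could look roughly like the following Python sketch. It is not the proprietary software described above: the `Appliance` type, the `tunnel_configs` function and the configuration keys are hypothetical names invented for this example, and the IP addresses come from the reserved documentation ranges. It only shows the general idea of turning a list of discovered peers into one tunnel definition per remote appliance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """A discovered VPN appliance (hypothetical model for this example)."""
    region: str
    public_ip: str
    vpc_cidr: str


def tunnel_configs(local, discovered):
    """Build one tunnel definition for every discovered peer except ourselves."""
    return [
        {
            "name": f"{local.region}-to-{peer.region}",
            "local_ip": local.public_ip,
            "remote_ip": peer.public_ip,
            "remote_cidr": peer.vpc_cidr,
        }
        for peer in discovered
        if peer != local
    ]


# Hypothetical discovery result for three regions.
appliances = [
    Appliance("us-east-1", "203.0.113.10", "10.0.0.0/16"),
    Appliance("eu-west-1", "203.0.113.20", "10.1.0.0/16"),
    Appliance("ap-south-1", "203.0.113.30", "10.2.0.0/16"),
]
configs = tunnel_configs(appliances[0], appliances)
for cfg in configs:
    print(cfg["name"], "->", cfg["remote_ip"], cfg["remote_cidr"])
```

Running the same function on every appliance yields the full mesh: each instance ends up with a tunnel to every other discovered instance.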
It natively supports all the necessary features to ensure a highly available and reliable VPN service including: self-healing instances, multi-AZ deployment and automatic AWS route table management for failover.
Let’s look at some of the features:
The VPN connections spun up by the CloudFormation stack use the IPsec protocol suite to authenticate and encrypt the data.
Full-mesh VPN core
Datapath.io’s managed VPN solution is built around a full mesh VPN core. VPN tunnels between VPCs in different AWS regions are arranged in a full mesh architecture, where every VPC is connected to every other VPC through an IPsec tunnel. On the edges of this full mesh, leaf VPCs inside the same AWS region are then connected to the edge VPC through a transit (hub and spoke) architecture.
The transit VPC feature at the edge also means that regions with multiple VPCs do not need individual VPN instances assigned to each VPC. VPN instances are placed in the transit VPC, and all the other VPCs can route their traffic over this VPC and out through the VPN.
Self-healing properties (with multi-AZ deployment)
We also built multiple redundancies into the managed VPN solution. The CloudFormation stack that sets up the VPN connections places the instances handling those connections into separate availability zones. These instances are arranged in an active-passive setup. Health status checks initiated through Auto Scaling groups seamlessly bring up the passive instance whenever an AZ or instance fails, without affecting the VPN tunnel.
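The active-passive behaviour can be sketched in a few lines of Python. This is a simplified model under our own assumptions, not the actual health-check logic: `VpnInstance` and `select_active` are hypothetical names, and a real deployment would take its health status from Auto Scaling health checks rather than a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class VpnInstance:
    """A VPN instance in one availability zone (hypothetical model)."""
    instance_id: str
    az: str
    healthy: bool = True


def select_active(current_id, instances):
    """Keep the current active instance while it is healthy, else promote a healthy standby."""
    by_id = {inst.instance_id: inst for inst in instances}
    if current_id in by_id and by_id[current_id].healthy:
        return current_id
    for inst in instances:
        if inst.healthy:
            return inst.instance_id
    raise RuntimeError("no healthy VPN instance available")


# Hypothetical pair spread over two availability zones.
pair = [
    VpnInstance("i-active", "eu-west-1a"),
    VpnInstance("i-standby", "eu-west-1b"),
]
active = select_active("i-active", pair)  # stays on i-active
pair[0].healthy = False                   # simulate an instance or AZ failure
failed_over = select_active(active, pair) # promotes i-standby
print(active, "->", failed_over)
```

Whichever instance the function returns is the one that routes for the VPN traffic would need to point to after a failover.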
The managed VPN service can also be extended to reserve bandwidth for inter-region communication. We leverage Direct Connect links with AWS and transit provider partners to provide this functionality. Reserved bandwidth also introduces an additional layer of redundancy to the entire setup by allowing us to use the default AWS gateways for failover. This makes the connection more resilient and reliable and allows us to provide a very high availability level, even though AWS itself does not provide an SLA for Direct Connect.
Cross account connectivity
The managed VPN solution also natively supports VPN tunnels between VPCs in different AWS accounts. This is important, since we frequently run into use cases where customers have VPCs distributed across accounts that they want to connect over VPN tunnels. Secure communication with partner VPCs is another aspect where cross account connectivity comes into play.
Diversified spot-instance launch groups
The managed VPN solution can also deploy spot instances to handle the VPN connections, in which case we run one instance per availability zone at the spot instance price.
AWS KMS key management
The managed VPN solution also has built-in support for AWS KMS key management. The key management feature allows users to manage their own keys, including key roll-over and key invalidation. It derives keys from the user’s KMS master key, without Datapath.io having any visibility into those keys.
1-click setup for IPsec-based VPN tunnels
Setting up IPsec-based VPN tunnels between AWS regions is as easy as choosing the regions and clicking a button. The console automates the entire process and provisions all required AWS resources, including AWS virtual private gateways, instances and VPN tunnels.
Take a look at the screenshot below:
Provision IPsec VPN tunnel on AWS
Most other solutions require users to manually create and configure both AWS and VPN artifacts. These include, but are not limited to, creating AWS virtual private gateways and creating and configuring the VPN tunnels.
If you want to learn more about the Managed VPN solution, download the Whitepaper.