High-Availability VPN on AWS with Strongswan

For all code used here, visit GitHub

Recently I had some fun implementing “Strongswan”, a software-based VPN solution, on AWS to integrate seamlessly with multiple partners over simple IPsec VPNs.

AWS VPN vs Strongswan

Normally, I would suggest using the AWS-provided VPN solution, but if you need to integrate with multiple partners over VPN and they all have different requirements (IKEv1, IKEv2, ciphers, PSK, etc.), you are going to end up looking for a self-hosted, software-based VPN solution.
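
To make the “different requirements” point concrete, here is a minimal sketch of how two partners with different IKE versions and ciphers could sit side by side in Strongswan’s ipsec.conf. Every connection name, IP address, subnet and cipher proposal below is a made-up placeholder, and the helper function name is hypothetical; this is not the configuration from the repo.

```shell
#!/bin/sh
# Sketch only: writes a sample ipsec.conf with two partner connections.
# All names, addresses and proposals are placeholders for illustration.
write_sample_ipsec_conf() {
    conf_path="$1"   # e.g. /etc/ipsec.conf on the VPN instance
    cat > "$conf_path" <<'EOF'
# Shared defaults: PSK authentication, our (hypothetical) public IP and VPC CIDR
conn %default
    authby=secret
    left=%defaultroute
    leftid=203.0.113.10
    leftsubnet=10.0.0.0/16
    auto=start

# Partner A insists on IKEv1 with an older proposal
conn partner-a
    keyexchange=ikev1
    ike=aes128-sha1-modp1024!
    esp=aes128-sha1!
    right=198.51.100.20
    rightsubnet=192.168.10.0/24

# Partner B supports IKEv2 with stronger ciphers
conn partner-b
    keyexchange=ikev2
    ike=aes256-sha256-modp2048!
    esp=aes256-sha256!
    right=198.51.100.30
    rightsubnet=192.168.20.0/24
EOF
}
```

Calling write_sample_ipsec_conf /tmp/ipsec.conf.sample produces a file you can compare against your own partner requirements. The matching pre-shared keys would go into ipsec.secrets, which is not shown here.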

You could go with something enterprise-grade like Cisco Cloud Routers, but not every company can afford the licenses, so your best bet is something open source like “Strongswan”.

It’s also cheaper if you are willing to manage the VPN on your own. AWS VPN costs around $50 a month for each connection. With a self-hosted VPN you pay only for the EC2 instances, so you could run your VPN on $5 t3.nano instances.

I don’t recommend running it on t3.nano instances though, as VPNs can be heavy on CPU. In a production environment I recommend the c5 instance types, as they are the cheapest instances to offer up to 10 Gbps networking and they have better CPUs.

Running Strongswan on AWS

All scripts and resources mentioned in this section can be found on GitHub.

Setup

For the OS I decided to go with CentOS, as it’s one of the most commonly used operating systems for software VPNs.

There are, however, some caveats with running CentOS on AWS compared to Amazon Linux instances. The biggest one is probably having to configure any new ENIs manually. Amazon Linux handles them out of the box.

Here are some of the most interesting lines from our userdata script; you can access the full file on GitHub:
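
To give an idea of what “configuring an ENI manually” can involve, here is a rough sketch that writes a minimal interface config file for an additional ENI on CentOS 7. The device name, the use of DHCP and the function name are assumptions made for illustration; the actual network setup used here lives in the full userdata script.

```shell
#!/bin/sh
# Sketch: write a minimal ifcfg file for a secondary ENI on CentOS 7.
# The device name and the choice of DHCP are assumptions for illustration.
write_eni_ifcfg() {
    scripts_dir="$1"   # normally /etc/sysconfig/network-scripts
    device="$2"        # e.g. eth1 for the first additional ENI
    cat > "$scripts_dir/ifcfg-$device" <<EOF
DEVICE=$device
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
# Keep the default route and DNS settings of the primary interface
DEFROUTE=no
PEERDNS=no
EOF
}
```

On the instance this would be used roughly as write_eni_ifcfg /etc/sysconfig/network-scripts eth1 followed by ifup eth1. Depending on your routing needs you may also have to add policy routing rules for the new interface, which this sketch leaves out.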

# 1. Strongswan
wget http://download.strongswan.org/strongswan-5.7.1.tar.bz2
...
./configure --prefix=/usr --sysconfdir=/etc --enable-sql
# 2. Disable SELinux
sed -i 's/^SELINUX=.*$/SELINUX=disabled/g' /etc/selinux/config
shutdown -r now
  1. As you can see, we are installing “Strongswan” from source and not from the official Yum repository. I did so because I wanted to be able to install “Strongswan” modules that don’t come with the yum repo version, such as sql. Also, the yum repo “Strongswan” package for CentOS for some reason renames the ipsec CLI tool to strongswan, which makes it annoyingly long to type.
  2. We disable SELinux and reboot so the change takes effect. SELinux was blocking keepalived from running custom scripts, and I decided to turn it off entirely rather than configure it.

The CloudFormation resources and the full userdata script contain more than is shown here, some of which you might not need, such as:

  1. AWS CLI installation
  2. AWS SSM agent installation (used for dynamic VPN configuration)
  3. Network setup for the EC2 instance to act as NAT/VPN
  4. Disabling source/destination check on AWS side to let EC2 instance forward packets
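
The last two items can be sketched roughly as follows. This is a minimal illustration, not the exact commands from the repo: the function names and the sysctl file name are hypothetical, and the instance ID is whatever your instance reports. The AWS side uses the aws ec2 modify-instance-attribute command with its --no-source-dest-check flag.

```shell
#!/bin/sh
# Sketch: let the instance forward packets, on the kernel side and the AWS side.

# Kernel side: enable IPv4 forwarding now and persist it across reboots.
# Needs root. The file name under /etc/sysctl.d is a placeholder.
enable_ip_forwarding() {
    sysctl -w net.ipv4.ip_forward=1
    echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/90-vpn-forwarding.conf
}

# AWS side: turn off the source/destination check so the instance may
# forward traffic that is not addressed to itself.
disable_source_dest_check() {
    instance_id="$1"
    aws ec2 modify-instance-attribute \
        --instance-id "$instance_id" \
        --no-source-dest-check
}
```

The instance needs IAM permission for ec2:ModifyInstanceAttribute if it runs the second function on itself; alternatively the same setting can be applied from CloudFormation.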

Dynamic Configuration

If you have followed everything mentioned above, at this point you should have an EC2 server with basic networking done and “Strongswan” installed.

You could now go ahead and manually configure each VPN connection, but I wanted my configuration to be more “dynamic”, meaning configured without the need to SSH into the EC2 machine every time. I also wanted my configuration and secrets to be backed up in case the EC2 instance dies.

There are multiple Strongswan options/plugins that allow dynamic configuration, such as the sql and mysql plugins. My personal issue with these plugins is that their configuration looks complex and awkward. Most of the data in the SQL tables has to be stored in an encoded binary form, even IPs, which makes it complicated to view and configure.

I decided to go for something much simpler — AWS Systems Manager.

I’ve been using SSM (previously called Simple Systems Manager; the SSM abbreviation stuck) since its release under the EC2 Console. It has come a long way since then, gaining its own UI/dashboard in the AWS Console and many more configuration options.

All you need to use SSM is the SSM agent installed on your EC2 instance and voila, through the SSM UI you can now do all kinds of magic with that instance, even manage shell sessions without having to manage keys and users (Session Manager).

In my case I only wanted to use it for 2 basic things:

  1. An SSM Document to configure a custom AMI for my “Strongswan” instance (installation from scratch on CentOS 7 takes a long time, and I wanted my instances to be bootable and ready in seconds)
  2. An SSM Automation that runs hourly, syncs VPN files from an S3 bucket to the instance, checks whether the files changed, updates them and reloads the VPN configuration
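
The sync step from the second item could look something like the sketch below. It assumes the config files sit flat under one S3 prefix and map one-to-one onto files in a target directory; the bucket URL, paths and function names are placeholders, and the real automation in the repo may work differently. The ipsec rereadsecrets and ipsec update commands are used so that only changed secrets and connections are picked up.

```shell
#!/bin/sh
# Sketch of the hourly sync step. Bucket URL and paths are placeholders.

# Copy files from staging_dir into target_dir only when their content differs.
# Returns 0 if at least one file was updated, 1 if nothing changed.
apply_changed_files() {
    staging_dir="$1"
    target_dir="$2"
    changed=1
    for src in "$staging_dir"/*; do
        [ -f "$src" ] || continue
        dest="$target_dir/$(basename "$src")"
        # A missing destination file also counts as "changed"
        if ! cmp -s "$src" "$dest" 2>/dev/null; then
            cp "$src" "$dest"
            changed=0
        fi
    done
    return "$changed"
}

# Download the bucket content to a temp dir, apply changes, reload if needed.
sync_vpn_config() {
    bucket_url="$1"      # e.g. s3://my-vpn-config/strongswan (hypothetical)
    target_dir="$2"      # e.g. /etc
    staging_dir=$(mktemp -d)
    aws s3 sync "$bucket_url" "$staging_dir" --quiet || return 1
    if apply_changed_files "$staging_dir" "$target_dir"; then
        ipsec rereadsecrets   # pick up a changed ipsec.secrets
        ipsec update          # apply changed connections from ipsec.conf
    fi
    rm -rf "$staging_dir"
}
```

Staging the files in a temporary directory first means the VPN is only reloaded when something actually changed, so an hourly run is harmless for established tunnels.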

High Availability — Keepalived

By now we have installed Strongswan and figured out some sort of dynamic configuration through SSM.

The only thing left to figure out at this point was High Availability. Strongswan has an option to run in a sort of “clustered” mode, but that requires control over the private IP space, which unfortunately you cannot have in AWS (Strongswan HA).

I’ve decided to go for something that could be easily implemented on AWS.

I could have gone with simply reassigning the Elastic IP to a slave instance upon failure, but I liked the idea of all VPN connections going out through an ENI with the same MAC address, as well as the master VPN server always having the same private IP, just like you would do in a local DC.

Long story short, I’ve created an ENI with a selected private IP and an EIP assigned to it. That ENI is initially attached to the master instance.

Both instances run Keepalived, with the slave having a lower priority. Each instance has a similar script that runs when its state changes to master: it detaches the ENI from the previous instance, attaches it to itself, sets up networking and re-establishes the VPN connections.
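
A Keepalived setup along these lines can be sketched as below. This is an illustration under assumptions, not the config from the repo: the interface name, router ID, IPs, the notify script path and the helper function name are all placeholders. It uses unicast peers because a regular VPC does not deliver the multicast traffic VRRP uses by default.

```shell
#!/bin/sh
# Sketch: generate a keepalived.conf for one node of the pair.
# Interface name, IPs, router id and the notify script path are placeholders.
write_keepalived_conf() {
    conf_path="$1"    # e.g. /etc/keepalived/keepalived.conf
    priority="$2"     # higher on the master, lower on the slave
    local_ip="$3"     # this node's primary private IP
    peer_ip="$4"      # the other node's primary private IP
    cat > "$conf_path" <<EOF
vrrp_instance VPN {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority $priority
    advert_int 1
    unicast_src_ip $local_ip
    unicast_peer {
        $peer_ip
    }
    notify_master "/usr/local/bin/vpn-takeover.sh"
}
EOF
}
```

For example, the master could be generated with write_keepalived_conf /etc/keepalived/keepalived.conf 150 10.0.0.10 10.0.0.11 and the slave with priority 100 and the two IPs swapped. The script named in notify_master is what Keepalived runs when a node is promoted to master.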

The failover takes a minute or two, but when the master instance dies, all VPN links get re-established without anyone having to do anything manually.
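
The ENI takeover itself might look roughly like the sketch below. It is a simplified illustration, not the script from the repo: the function name, the device index and the way the IDs are passed in are assumptions, and the OS-level network setup for the newly attached interface is left out. It relies on the aws ec2 describe-network-interfaces, detach-network-interface, wait network-interface-available and attach-network-interface commands.

```shell
#!/bin/sh
# Sketch of a takeover step run when keepalived promotes this node to master.
# The ENI ID, instance ID and device index are placeholders from the caller.
take_over_eni() {
    eni_id="$1"
    instance_id="$2"
    # Find out whether the ENI is currently attached somewhere
    attachment_id=$(aws ec2 describe-network-interfaces \
        --network-interface-ids "$eni_id" \
        --query 'NetworkInterfaces[0].Attachment.AttachmentId' \
        --output text)
    # "None" means the ENI is not attached anywhere right now
    if [ -n "$attachment_id" ] && [ "$attachment_id" != "None" ]; then
        aws ec2 detach-network-interface --attachment-id "$attachment_id" --force
        aws ec2 wait network-interface-available --network-interface-ids "$eni_id"
    fi
    aws ec2 attach-network-interface \
        --network-interface-id "$eni_id" \
        --instance-id "$instance_id" \
        --device-index 1
    # OS-level setup of the new interface (IP, routes) is omitted here
    ipsec restart
}
```

Waiting for the ENI to become available before attaching it is where most of the failover time goes. The instance role needs permission to describe, detach and attach network interfaces for this to work.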

That’s all folks. For any questions, comments or advice, comment here or reach me on social networks :)