In this article, you’ll find a simple guide explaining how to build a centralised, secure Internet access (egress-filtering) solution using AWS Network Firewall in a multi-account AWS environment. I’ll describe why you’d want an egress-filtering solution, a potential architecture, and how to build it, plus a few lessons and thoughts on AWS Network Firewall after building this solution myself.
1. What is Egress-Filtering?
In simple terms, egress-filtering involves controlling and monitoring network traffic that flows out from your networks to the Internet. Most people understand the importance of controlling connectivity from the Internet into their networks (through snazzy next-gen firewalls). Yet, they often don’t realise the risk that exists from inadequate control of outbound connectivity.
Properly implemented egress-filtering can reduce the risk of data exfiltration, disrupt malware command and control traffic, prevent consumption of unwanted or unknown services and stop your systems from being used to attack others if compromised.
Egress-filtering is an important security control. In my view, it’s just as important as, or, at the risk of being a little controversial, maybe more important than ingress-filtering. If you’re still unsure about the concept of egress-filtering, plenty of great content describes it in more detail.
2. The Architecture
So, let’s look at how you might architect an egress-filtering solution using AWS Network Firewall. In the diagram below, you’ll see a simple architecture that provides Internet connectivity for a pair of isolated “application” VPCs through Transit Gateway and AWS Network Firewall. This is the architecture that we will build, step by step, in the next section.
This architecture provides centralised egress-filtering for multiple application / workload VPCs without much complexity. There are plenty of other AWS Network Firewall deployment models, so before you start building, you should read this great AWS blog post to see if another model might be more suitable.
The actual filtering is performed by AWS Network Firewall, which is a highly scalable and resilient AWS-native firewall service that provides a range of features. The key capability we need from AWS Network Firewall for egress-filtering comes from Stateful Rule Groups, and more specifically the ability to control outbound traffic using a FQDN / domain allow-list.
Before going any further, I need to stress that AWS Network Firewall is quite different to a traditional, run of the mill firewall. There are a few gotchas that you need to be aware of, which I will cover in detail at the end of this article.
3. Building It
Alright, let’s get down to building it. The steps below cover everything required to build the solution in a new / fresh environment. Existing environments that already have VPCs and Transit Gateways can likely skip a few of these steps. For brevity, I’m not going to cover Transit Gateway or VPC (including subnets / route tables) concepts in any more detail than is necessary to build the solution.
You will need AWS console and CLI access to your account(s) and appropriate IAM privileges. Let’s go:
Step 1 — Create “application” VPC(s) and subnet(s) for each AZ you want to use, remembering to create separate TGW attachment subnets. You don’t need to create any special route tables for these subnets.
Step 2 — Create your “firewall” VPC and subnet(s) for each AZ you want to use. You will need 3 subnets for each AZ: (1) a TGW attachment subnet, (2) a firewall endpoint subnet and (3) a public subnet. Each subnet will need its own dedicated route table. Associate each route table with the appropriate subnet.
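As a rough planning aid for Steps 1 and 2, here is a minimal Python sketch that carves a "firewall" VPC CIDR into the three subnets needed in each AZ. The CIDR (10.100.0.0/24), the /28 subnet size and the AZ names are assumptions for illustration only, not values from my build.

```python
import ipaddress

# Hypothetical values for illustration; substitute your own.
FIREWALL_VPC_CIDR = "10.100.0.0/24"
AZS = ["ap-southeast-2a", "ap-southeast-2b"]
ROLES = ["tgw-attachment", "firewall-endpoint", "public"]


def plan_firewall_subnets(vpc_cidr, azs, roles=ROLES, new_prefix=28):
    """Return {(az, role): subnet} with one subnet per role in each AZ."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for az in azs:
        for role in roles:
            # next() raises StopIteration if the VPC CIDR is too small.
            plan[(az, role)] = next(blocks)
    return plan


if __name__ == "__main__":
    for (az, role), subnet in plan_firewall_subnets(FIREWALL_VPC_CIDR, AZS).items():
        print(f"{az:<18} {role:<18} {subnet}")
```

Each (AZ, role) pair gets its own non-overlapping block, which lines up with the one-route-table-per-subnet layout described in Step 2.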
Step 3 — Create and attach an IGW to your “firewall” VPC. Then create one or more NAT Gateways in the public subnet(s) within your “firewall” VPC. Don’t worry about configuring any route tables at this stage; we’ll do that later.
Step 4 — Create a Transit Gateway, making sure to choose an appropriate ASN for your network. Create a “firewall” TGW route table, and optionally one or more “application” TGW route tables if required. In my case, I’ve just used the default TGW route table and renamed it to “km-app-rt”.
Step 5 — Create a TGW attachment for each “application” VPC and the “firewall” VPC, making sure that you select the right TGW attachment subnets and not your workload subnets. Once the attachments are available, take note of the Transit Gateway attachment ID for your “firewall” attachment.
Step 6 — Switch back to your TGW route tables and navigate to your “application” route table(s). Delete the “firewall” association and propagation from the respective tabs. You should only be able to see your “application” attachments in associations and propagations at this point. Finally, switch across to your route tab and add a static default route (0.0.0.0/0) pointing to the “firewall” attachment.
Step 7 — Switch across to your “firewall” route table and create an association for just your “firewall” attachment. Then go to propagations and create a propagation for each “application” attachment. You don’t need to create a propagation for your “firewall” attachment.
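To sanity-check the Transit Gateway routing from Steps 6 and 7, here is a small Python model of the two TGW route tables with a longest-prefix-match lookup. It is only a sketch: the attachment names and CIDRs are hypothetical, and it models routing decisions rather than calling any AWS API.

```python
import ipaddress

# Hypothetical attachment names and CIDRs, for illustration only.
APP_RT = {            # associated with the "application" attachments
    "10.1.0.0/16": "app1-attachment",      # propagated
    "10.2.0.0/16": "app2-attachment",      # propagated
    "0.0.0.0/0": "firewall-attachment",    # static default route (Step 6)
}
FIREWALL_RT = {       # associated with the "firewall" attachment
    "10.1.0.0/16": "app1-attachment",      # propagated (Step 7)
    "10.2.0.0/16": "app2-attachment",      # propagated (Step 7)
}


def next_attachment(route_table, dest_ip):
    """Longest-prefix match; returns None when no route matches (traffic is dropped)."""
    dest = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    print(next_attachment(APP_RT, "52.95.36.1"))    # Internet-bound traffic
    print(next_attachment(FIREWALL_RT, "10.2.3.4"))  # return traffic to an app VPC
```

One thing this model makes visible: because the “application” attachments still propagate into the “application” route table, traffic between application VPCs matches the more specific propagated route rather than the default route to the firewall. If you need that traffic inspected too, that is worth checking in your own environment.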
Step 8 — It’s time to create our firewall. Navigate to AWS Network Firewalls — Firewall and click Create Firewall. Give it a name, choose your “firewall” VPC, the AZs you want to use, and make sure you select your firewall endpoint subnet(s). Finally, select “Create and associate an empty firewall policy”, enter a name for your firewall policy and hit the create button.
Step 9 — Pop into the Firewall details section and take note of your Firewall Endpoint ID(s), as these will be used in our VPC route tables later. Next, we’ll enable logging by clicking Edit on the Logging section, enabling both Alert and Flow logging and configuring destinations, which in my case was a CloudWatch Log Group called “km-fw-log”.
Step 10 — Ok we’re nearly there, but take a deep breath, it’s about to get a touch more complex. Browse down in the Associated firewall policy rule groups to “Stateful Rule Groups” and click on Add rule groups -> Create and add new stateful rule group. Give your group a name (I’ve used “egress-filter”), set the capacity to 1000, select the “Suricata compatible IPS rules” option (I’ll explain the logic behind this a bit later) and paste in the following rules:
# Amazon domains
pass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:".amazonaws.com"; endswith; msg:"Pass HTTP to .amazonaws.com"; sid:1001; priority:10; rev:1;)
pass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:".amazonaws.com"; endswith; msg:"Pass TLS to .amazonaws.com"; sid:1002; priority:20; rev:1;)

# Microsoft domains
pass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:".microsoft.com"; endswith; msg:"Pass HTTP to .microsoft.com"; sid:1003; priority:30; rev:1;)
pass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:".microsoft.com"; endswith; msg:"Pass TLS to .microsoft.com"; sid:1004; priority:40; rev:1;)
pass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:".windowsupdate.com"; endswith; msg:"Pass HTTP to .windowsupdate.com"; sid:1005; priority:50; rev:1;)
pass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:".windowsupdate.com"; endswith; msg:"Pass TLS to .windowsupdate.com"; sid:1006; priority:60; rev:1;)

# Drop other traffic
drop tcp $HOME_NET any -> $EXTERNAL_NET ![80,443] (msg:"Drop any TCP traffic not on port 80/443"; sid:2001; priority:10; rev:1;)
drop tcp $HOME_NET any -> $EXTERNAL_NET [80,443] (msg:"Drop any non-HTTP/TLS traffic on TCP 80/443"; flow:established; sid:2002; priority:20; rev:1;)
drop ip $HOME_NET any <> $EXTERNAL_NET any (msg:"Drop all other non-TCP traffic"; ip_proto:!TCP; sid:2003; priority:30; rev:1;)
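If you have more than a handful of domains, typing these pass rules by hand gets tedious. Here is a minimal Python sketch that generates the HTTP and TLS pass rules from a list of domain suffixes, following the same sid and priority numbering as the rules above. The function names and defaults are my own assumptions, not part of any AWS tooling.

```python
def allow_rules(domain, first_sid, first_priority):
    """Build the HTTP and TLS pass rules for one domain suffix, e.g. ".amazonaws.com"."""
    template = (
        'pass {proto} $HOME_NET any -> $EXTERNAL_NET {port} '
        '({keyword}; dotprefix; content:"{domain}"; endswith; '
        'msg:"Pass {label} to {domain}"; sid:{sid}; priority:{priority}; rev:1;)'
    )
    variants = [("http", 80, "http.host", "HTTP"), ("tls", 443, "tls.sni", "TLS")]
    return [
        template.format(proto=proto, port=port, keyword=keyword, label=label,
                        domain=domain, sid=first_sid + i,
                        priority=first_priority + 10 * i)
        for i, (proto, port, keyword, label) in enumerate(variants)
    ]


def build_allow_list(domains, first_sid=1001, first_priority=10):
    """Join the pass rules for every domain, numbering sids and priorities in order."""
    rules = []
    for n, domain in enumerate(domains):
        rules += allow_rules(domain, first_sid + 2 * n, first_priority + 20 * n)
    return "\n".join(rules)


if __name__ == "__main__":
    print(build_allow_list([".amazonaws.com", ".microsoft.com", ".windowsupdate.com"]))
```

Suricata’s priority keyword only accepts values from 1 to 255, so this particular numbering scheme runs out after about a dozen domains; you would need a smaller step for a longer list.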
Step 11 — Fire up your AWS CLI, connect to the account containing your firewall and run the following command (substituting the --rule-group-name value with whatever you named your rule group earlier and fixing up the region if you’re not an Aussie), then copy the output into a text editor:
aws network-firewall describe-rule-group --type STATEFUL --rule-group-name egress-filter --region ap-southeast-2
Step 12 — At this point, we need to tell AWS Network Firewall what IP ranges exist within our network so that the application-layer decoders (we need HTTP / TLS) work. We do this by updating our rule group with an explicit HOME_NET variable (described in more detail here).
We basically need to strip out everything except the RulesSource section, then stick a RuleVariables section on top that defines our internal network ranges (it must include both your “application” and “firewall” VPC CIDR ranges). Once complete, save it to a file named “variables.json”:
"RulesString": "# Amazon domains\npass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:\".amazonaws.com\"; endswith; msg:\"Pass HTTP to .amazonaws.com\"; sid:1001; priority:10; rev:1;)\npass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:\".amazonaws.com\"; endswith; msg:\"Pass TLS to .amazonaws.com\"; sid:1002; priority:20; rev:1;)\n\n# Microsoft domains\npass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:\".microsoft.com\"; endswith; msg:\"Pass HTTP to .microsoft.com\"; sid:1003; priority:30; rev:1;)\npass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:\".microsoft.com\"; endswith; msg:\"Pass TLS to .microsoft.com\"; sid:1004; priority:40; rev:1;)\npass http $HOME_NET any -> $EXTERNAL_NET 80 (http.host; dotprefix; content:\".windowsupdate.com\"; endswith; msg:\"Pass HTTP to .windowsupdate.com\"; sid:1005; priority:50; rev:1;)\npass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; dotprefix; content:\".windowsupdate.com\"; endswith; msg:\"Pass TLS to .windowsupdate.com\"; sid:1006; priority:60; rev:1;)\n\n# Drop other traffic\ndrop tcp $HOME_NET any -> $EXTERNAL_NET ![80,443] (msg:\"Drop any TCP traffic not on port 80/443\"; sid:2001; priority:10; rev:1;)\ndrop tcp $HOME_NET any -> $EXTERNAL_NET [80,443] (msg:\"Drop any non-HTTP/TLS traffic on TCP 80/443\"; flow:established; sid:2002; priority:20; rev:1;)\ndrop ip $HOME_NET any <> $EXTERNAL_NET any (msg:\"Drop all other non-TCP traffic\"; ip_proto:!TCP; sid:2003; priority:30; rev:1;)"
Step 13 — Back to the AWS CLI now, where we need to update the rule group using the “variables.json” file that we just created. You will also need the “UpdateToken” and “RuleGroupArn” values from the output of the original “describe-rule-group” command we ran earlier.
Run the following command to update your rule group:
aws network-firewall update-rule-group --rule-group-arn arn:aws:network-firewall:ap-southeast-2:747843067444:stateful-rulegroup/egress-filter --update-token 609b2ae1-cb58-45bb-bf21-2651c804fd40 --rule-group file://variables.json --region ap-southeast-2
If the update was successful, the CLI returns the rule group’s metadata along with a new “UpdateToken”.
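Steps 11 to 13 involve a fair bit of copy and paste, so here is a minimal Python sketch of the same reshaping: it keeps only the RulesSource section from the describe-rule-group output, adds a HOME_NET definition, and assembles the update-rule-group command line. The sample ARN, token and CIDRs are placeholders I made up for illustration, and the helper names are hypothetical.

```python
import json
import shlex


def build_rule_group(describe_output, home_net_cidrs):
    """Keep only RulesSource from describe-rule-group output and add HOME_NET."""
    return {
        "RuleVariables": {
            "IPSets": {"HOME_NET": {"Definition": list(home_net_cidrs)}}
        },
        "RulesSource": describe_output["RuleGroup"]["RulesSource"],
    }


def update_command(describe_output, rule_group_file="variables.json", region=None):
    """Build the update-rule-group command line from describe-rule-group output."""
    args = [
        "aws", "network-firewall", "update-rule-group",
        "--rule-group-arn", describe_output["RuleGroupResponse"]["RuleGroupArn"],
        "--update-token", describe_output["UpdateToken"],
        "--rule-group", f"file://{rule_group_file}",
    ]
    if region:
        args += ["--region", region]
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder, trimmed-down describe-rule-group output (not real values).
    described = {
        "UpdateToken": "00000000-0000-0000-0000-000000000000",
        "RuleGroup": {"RulesSource": {"RulesString": "# rules go here"}},
        "RuleGroupResponse": {
            "RuleGroupArn": "arn:aws:network-firewall:ap-southeast-2:"
                            "111122223333:stateful-rulegroup/egress-filter"
        },
    }
    # Placeholder CIDRs: two application VPCs plus the firewall VPC.
    cidrs = ["10.1.0.0/16", "10.2.0.0/16", "10.100.0.0/24"]
    print(json.dumps(build_rule_group(described, cidrs), indent=2))
    print(update_command(described, region="ap-southeast-2"))
```

Keep in mind that the update token changes after every successful update, so you would need to run describe-rule-group again before making a further change.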
Step 14 — Ok, phew, that’s the hard part over. Remember, we’ll talk about what we just did a bit later, but first, let’s finish the build. Switch over to your VPC route tables and open your “firewall” public route table(s). Add a default route (0.0.0.0/0) pointing to the IGW and another route for your “application” VPC CIDR ranges pointing to the Firewall Endpoint IDs you captured earlier. Make sure you align your route table(s) and firewall endpoint(s) in each AZ!
Step 15 — Now switch to your “firewall” firewall endpoint route table, and create a default route (0.0.0.0/0) pointing to your NAT Gateways (making sure to align AZs) and another route for your “application” VPC CIDR ranges pointing to your Transit Gateway.
Step 16 — Now switch to your “firewall” transit gateway attachment route table and create a default route (0.0.0.0/0) pointing again to your Firewall Endpoint IDs (aligning to AZs again).
Step 17 — Ok, this is the last step before traffic will start flowing to the Internet. Navigate over to your “application” route table(s) and add a default route (0.0.0.0/0) pointing to your Transit Gateway.
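The route tables from Steps 14 to 16 are easier to reason about when you trace a packet through them. Here is a small Python sketch that models the three “firewall” VPC route tables for a single AZ and follows traffic hop by hop. The CIDRs and target names are hypothetical stand-ins for your real IDs, and the model ignores the implicit local route.

```python
import ipaddress

# Hypothetical CIDRs and target names for one AZ; substitute your own IDs.
APP_CIDRS = ["10.1.0.0/16", "10.2.0.0/16"]
ROUTE_TABLES = {
    "tgw-attachment": {"0.0.0.0/0": "firewall-endpoint"},
    "firewall-endpoint": {"0.0.0.0/0": "nat-gateway",
                          **{cidr: "transit-gateway" for cidr in APP_CIDRS}},
    "public": {"0.0.0.0/0": "internet-gateway",
               **{cidr: "firewall-endpoint" for cidr in APP_CIDRS}},
}
# Which subnet's route table applies once traffic leaves each hop.
HOP_SUBNET = {"firewall-endpoint": "firewall-endpoint", "nat-gateway": "public"}


def lookup(subnet, dest_ip):
    """Longest-prefix match in the route table of the given subnet."""
    dest = ipaddress.ip_address(dest_ip)
    nets = [(ipaddress.ip_network(c), t) for c, t in ROUTE_TABLES[subnet].items()]
    matching = [(n, t) for n, t in nets if dest in n]
    return max(matching, key=lambda m: m[0].prefixlen)[1]


def trace(start_subnet, dest_ip):
    """Follow routes from a subnet until traffic leaves the firewall VPC."""
    hops, subnet = [], start_subnet
    while subnet is not None:
        hop = lookup(subnet, dest_ip)
        hops.append(hop)
        subnet = HOP_SUBNET.get(hop)  # None for the IGW and Transit Gateway
    return hops


if __name__ == "__main__":
    print("outbound:", trace("tgw-attachment", "52.95.36.1"))
    print("return:  ", trace("public", "10.1.2.3"))
```

The point to check is that the firewall endpoint appears in both directions. If the public route table sent the “application” CIDRs straight to the Transit Gateway instead, return traffic would skip the firewall, which is the kind of asymmetric routing mistake I describe in the lessons further down.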
Step 18 — At this point, traffic should be flowing to the domains we allowed in our Suricata IPS rules and all other traffic should be blocked. You should test and verify that this is the case, using your firewall logs in CloudWatch Logs to support you.
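When verifying, it helps to summarise which rules are actually blocking traffic. The following Python sketch counts blocked alerts per signature from exported alert log lines. It assumes each line is a JSON object with an "event" containing "event_type" and an "alert" with "action", "signature_id" and "signature" fields (the Suricata EVE-style layout); the sample lines are made up, so check the field names against your own logs.

```python
import json
from collections import Counter


def blocked_signatures(log_lines):
    """Count blocked alerts per (signature_id, signature) from JSON log lines."""
    counts = Counter()
    for line in log_lines:
        event = json.loads(line).get("event", {})
        alert = event.get("alert", {})
        if event.get("event_type") == "alert" and alert.get("action") == "blocked":
            counts[(alert.get("signature_id"), alert.get("signature"))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = [
        json.dumps({"event": {"event_type": "alert", "dest_port": 22,
                              "alert": {"action": "blocked", "signature_id": 2001,
                                        "signature": "Drop any TCP traffic not on port 80/443"}}}),
        json.dumps({"event": {"event_type": "alert", "dest_port": 443,
                              "alert": {"action": "allowed", "signature_id": 1002,
                                        "signature": "Pass TLS to .amazonaws.com"}}}),
    ]
    for (sid, msg), count in blocked_signatures(sample).most_common():
        print(f"{count:>5}  sid {sid}  {msg}")
```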
4. Wrapping Up
I mentioned earlier that AWS Network Firewall is not your typical firewall. It’s just a hunch, but I think the reason AWS says “AWS Network Firewall enables customers to run Suricata-compatible rules” is because Stateful Rule Groups actually use the Suricata engine under the hood.
If you’re not aware of Suricata, it’s an awesome Intrusion Prevention System (IPS) that does a great job of identifying and acting upon specific traffic. What Suricata is not is a firewall. That isn’t to say it can’t control traffic; it’s just that it was designed to be an IPS, not a firewall, which leads to some idiosyncrasies and requires a different way of thinking about rules. Here’s what you need to know:
1. Stateful Rule Groups end with an implicit “pass all” — if traffic doesn’t match an explicit rule it will be passed, not blocked. This is a bit counter-intuitive to anyone expecting normal firewall behaviour of implicit “deny all”.
2. Stateful Rule Groups have a fixed processing order for actions: 1. Pass, 2. Drop, 3. Alert — It’s quite common to see a mix of allow and deny statements, ordered by priority, in most firewall rule sets. However, Stateful Rule Groups only allow ordering of rules within each action. This means that you can’t, for example, place a block rule before an allow rule.
3. Network Firewall “Domain Allow Lists” will pass specific domains then block only HTTP / TLS — You may be thinking, so what? Well, if you just use a single “Domain List” allow rule group, any traffic that isn’t HTTP / TLS (like SMB, DNS, malware C&C traffic) is allowed. This is the reason I used Suricata IPS rules instead of Domain Lists. The rules I provided will block all traffic other than HTTP / TLS to specified domains. These mechanics are explained on this page of the AWS Network Firewall user guide.
4. It’s easy to block all traffic with Suricata IPS rules — This was a mistake I made repeatedly when tinkering with rules. You’ll notice in the Suricata IPS rules I provided above that there are 3 odd “drop” statements at the end. Why not just put a “block ip any any” and be done with it? Well, because of the way Suricata rules work, that would also block the TCP 3-way handshake, and by extension, all TCP flows. In Suricata, flows transition from nothing to a simple TCP flow and then on to a rich application flow (e.g. HTTP / TLS) as the flow progresses. So back to the 3 weird drop rules at the end of the rule group, what they each do is: (1) drop any traffic to TCP ports that are not 80 or 443, (2) drop any established TCP sessions on port 80 or 443 that don’t match my HTTP / TLS pass rules and (3) drop everything that isn’t TCP. You’ll note that none of these rules drop a TCP 3-way handshake. There is a great AWS blog post that explains this in more detail.
5. If AWS Network Firewall doesn’t see flows in both directions it can fail open — At one point I made a routing mistake (a route that allowed my NAT Gateways to send traffic directly to the client) that resulted in the firewall seeing traffic only in the outbound direction. This resulted in completely open connectivity to TCP 80 and 443, because Suricata never saw the return traffic and so never applied the “flow:established” state that would have caused the flow to be killed by my 2nd drop rule.
6. AWS Network Firewall doesn’t decode traffic from networks other than the deployment VPC by default — You may be wondering why we had to bust out AWS CLI and fiddle with a variable called HOME_NET. Suricata uses this variable to determine which networks are “internal” vs. those that are “external”. By default, AWS Network Firewall populates this variable with the IP CIDR range for the VPC it’s deployed into. We updated this variable to include our “application” VPC IP CIDR ranges so that traffic would be parsed by the Suricata protocol decoders. I’m not sure why AWS only decodes traffic from HOME_NET (I haven’t tested this, but I’m really hoping it also covers traffic to HOME_NET), though I’m guessing it’s for performance reasons. If you don’t define this variable correctly, flows will never transition to HTTP / TLS (failing open, so be careful). The good news is, once set, the variable will persist through GUI rule changes! More info on how and why to set the HOME_NET variable can be found here.
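To make lessons 1, 2, 4 and 5 a bit more concrete, here is a deliberately simplified Python model of how the rule group from Step 10 treats a packet. It is not Suricata’s actual engine, just a sketch of the behaviour I observed: pass rules are checked first, then drop rules, and anything left over falls through to the implicit “pass all”. The Packet fields and function names are my own assumptions.

```python
from dataclasses import dataclass

ALLOWED_SUFFIXES = (".amazonaws.com", ".microsoft.com", ".windowsupdate.com")


@dataclass
class Packet:
    proto: str                 # "TCP", "UDP", ...
    dest_port: int
    established: bool = False  # True once the firewall has seen both directions
    app_proto: str = ""        # "http" or "tls" once the decoder has identified it
    host: str = ""             # HTTP Host header or TLS SNI


def host_allowed(host):
    # Mimics dotprefix + endswith: "." is prepended before the suffix match.
    return ("." + host).endswith(ALLOWED_SUFFIXES)


def verdict(pkt):
    """Evaluate in the fixed action order: pass rules first, then drop rules."""
    # Pass rules (sid 1001-1006): need a decoded HTTP/TLS flow on the right port.
    if pkt.proto == "TCP" and host_allowed(pkt.host) and (
        (pkt.app_proto == "http" and pkt.dest_port == 80)
        or (pkt.app_proto == "tls" and pkt.dest_port == 443)
    ):
        return "pass"
    # Drop rules (sid 2001-2003).
    if pkt.proto == "TCP" and pkt.dest_port not in (80, 443):
        return "drop (2001)"
    if pkt.proto == "TCP" and pkt.established:
        return "drop (2002)"
    if pkt.proto != "TCP":
        return "drop (2003)"
    # Nothing matched: the implicit "pass all" applies.
    return "pass (implicit)"


if __name__ == "__main__":
    print(verdict(Packet("TCP", 443)))                                  # handshake
    print(verdict(Packet("TCP", 443, True, "tls", "s3.amazonaws.com")))
    print(verdict(Packet("TCP", 443, True, "tls", "example.com")))
    print(verdict(Packet("UDP", 53)))
```

Notice that a TCP packet to port 443 that never becomes “established” keeps hitting the implicit pass. In this model, that is the fail-open behaviour from lesson 5: if the firewall never sees the return traffic, the second drop rule never gets a chance to fire.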
I think overall that AWS Network Firewall in combination with Transit Gateway offers a pretty simple, highly scalable and resilient egress-filtering service that does what it needs to. After building this solution, I have a few more closing thoughts I want to share with you:
1. Pricing — Initially, I thought that AWS Network Firewall seemed a bit pricey ($0.395 p/attach hour). After thinking about it some more, and realising that you get a free NAT Gateway and can potentially do away with a few VPC Endpoints, it’s actually not too bad. If we opted for traditional firewall appliances we’d also have to stump up for EC2 (which wouldn’t be trivial for high throughput) and subscription costs.
2. Rule Groups — Given the compatibility with Suricata rule sets, it should be pretty straightforward to use existing Suricata rule sets, such as Emerging Threats Open / Pro. Customers may also have custom rules, developed for their apps that can be re-used with minimal change.
I imagine as adoption of AWS Network Firewall increases that more 3rd party security vendors will likely port their threat detection rules across too. AWS might end up allowing these vendors to sell rule subscriptions via AWS Marketplace as they do with WAF Managed Rules today.
3. Comparison to traditional firewalls — For standard use cases, such as the egress-filtering solution that I’ve described in this article, I think AWS Network Firewall is a great choice. I also think it’s hard to beat on price-to-performance ratio for customers moving large volumes of traffic, given its pricing model and level of scalability (running on AWS Hyperplane).
For more complex use cases, traditional next-gen firewalls still have an edge owing to their breadth and depth of features. Features like threat intelligence feeds, TLS inspection, and more comprehensive management and visibility capabilities may be beneficial for some customers. I expect this gap will close over time as AWS Network Firewall evolves.
I hope you’ve found this article useful and thank you for reading!