Next-Gen Cloud Network Security: Design Notes

Published in Google Cloud - Community, Sep 16, 2024 (7 min read)

Welcome back to the NGFW Enterprise series!

In prior episodes, we covered the fundamentals of NGFW Enterprise and the nuances of TLS inspection. If you’re new here, it’s highly recommended you catch up on those articles before diving into this one.

To wrap up this series, we’ll examine some key considerations regarding how NGFW Enterprise influences our overall architecture and how to strategically design our Firewall Policy rules.

Architecture Changes

When I first encountered NGFW Enterprise, I wondered how it would transform a typical hub-and-spoke customer architecture that relies on traditional Network Virtual Appliances (NVAs).

Here’s a visual representation of a standard architecture for the majority of enterprise customers using NVAs:

A “classic” hub-and-spoke architecture, leveraging Network Virtual Appliances (NVAs).

The infrastructure is divided into an untrusted zone and a trusted zone, connected via NVAs for routing, IPS, and traffic filtering.

The untrusted zone is a single VPC for internet access components (e.g., Cloud NAT) or access from untrusted sources (e.g., Cloud Load Balancers).

The trusted zone has a hub centralizing spoke connectivity and connecting them to the untrusted zone.

Hybrid connectivity lands either in the untrusted or trusted hub, based on its trust level.

Common requirements:

Default deny ALL ingress traffic (L3, L4), usually GCP’s default.
Allowed traffic between environments and the Internet (north-south, east-west) undergoes IPS inspection.
Within an environment (spoke VPC), traffic is denied by default. Allowed traffic is usually not inspected.
Optionally, enable IPS inspection between specific sources/destinations, even within the same environment.

How does NGFW Enterprise transform the network architecture?

How the previous architecture changes when migrating to NGFW Enterprise.

The two hub VPCs now combine into a single entity since there are no more NVAs. With the elimination of the untrusted area (at least as a separate VPC), the components responsible for ingress and egress (such as Cloud NAT, Cloud Load Balancers) and the IPS functions are relocated to the application VPCs.

It’s important to remember that deploying NGFW Enterprise in the hub won’t directly impact traffic flows because NGFW doesn’t filter in-transit traffic.

This architectural shift essentially renders the hub an optional component. As a result, some users opt to directly land hybrid connectivity in the application VPCs, simplifying the architecture and avoiding potential VPC peering transitivity issues.

While a hub might still be necessary for functions like DNS forwarding and Private Google API access, that’s a topic for another discussion, beyond the scope of this article.

Firewall Policies design

When I embarked on my first NGFW Enterprise project, I naively assumed adding IPS functionality would be as simple as tacking on a few rules to my existing firewall policies. However, it turned out to be anything but straightforward. Rather than just presenting the final design, I’d like to walk you through the process, step-by-step. This way, you can better understand the challenges I faced and the rationale behind my decisions.

Using the new model discussed earlier, let’s consider a sample architecture with two VPCs: one for development and one for production. These VPCs are interconnected, but the specifics of that connection aren’t relevant here. Within each VPC, we’ll deploy some VMs and components for internet egress and external access to certain VMs.

Let’s break it down and see what unfolds.

Step 1

Prod — Ingress Policy Rules
1 — 0.0.0.0/0 to all instances → Apply Security Profile Group (SPG)

We introduce a rule to inspect all traffic originating from the internet and other environments.

However, this inadvertently overrides the implicit rule to deny ingress traffic, resulting in all instances being exposed to both internal and external traffic. Although all traffic is now subjected to inspection, this outcome doesn’t align with our original intent.
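The effect is easier to see with a toy model of first-match rule evaluation. The following is a minimal sketch, not the actual NGFW engine: the `Rule` class, the `evaluate` helper, and the `"inspect"` label (standing in for "Apply SPG") are all hypothetical names, and only the source address is matched.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number = evaluated first
    src: str       # source CIDR
    action: str    # "allow", "deny", or "inspect" (Apply SPG)


IMPLIED_DENY = "deny (implied)"


def evaluate(rules, src_ip):
    """Return the action of the first rule matching src_ip."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if ip_address(src_ip) in ip_network(rule.src):
            return rule.action
    # No explicit rule matched: the implied ingress deny applies.
    return IMPLIED_DENY


step1 = [Rule(1, "0.0.0.0/0", "inspect")]

# Both an Internet source and an internal source are now let in (with inspection).
print(evaluate(step1, "203.0.113.7"))
print(evaluate(step1, "10.0.0.5"))
# Without the rule, the implied deny would have applied.
print(evaluate([], "10.0.0.5"))
```

Because the catch-all rule matches every source, the implied deny at the end of the chain is never reached.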

Step 2

Prod — Ingress Policy Rules
1 — A to B → allow
2 — C to D → Apply SPG
3 — E to A → Apply SPG
4 — RFC1918 to all instances → deny
5 — 0.0.0.0/0 to all instances → Apply SPG

We insert some higher-priority rules to selectively permit specific intra-VPC traffic without inspection, allow specific intra-VPC traffic with inspection, and allow specific cross-VPC traffic with inspection. Finally, we deny any other RFC1918 traffic.

However, this still leaves all hosts from the internet able to communicate with all VMs.

Step 3

Prod — Ingress Policy Rules
1 — A to B → allow
2 — C to D → Apply SPG
3 — E to A → Apply SPG
4 — RFC1918 to all instances → deny
5 — 0.0.0.0/0 to A, B → Apply SPG
6 — 0.0.0.0/0 to all instances → deny

We restrict which instances can receive traffic from the Internet.

Traffic to the Internet remains uninspected.
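To check the Step 3 ingress chain, it can be modeled as an ordered list and probed with a few flows. This is a rough sketch under stated assumptions: the VM addresses are invented for illustration (A to D in prod, E in dev), `"inspect"` stands for "Apply SPG", and protocols and ports are ignored.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical addresses, chosen only for this illustration.
VM = {"A": "10.0.1.10", "B": "10.0.1.11", "C": "10.0.1.12",
      "D": "10.0.1.13", "E": "10.1.1.10"}
RFC1918 = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
ANY = ("0.0.0.0/0",)
INTERNET = "203.0.113.7"  # documentation address standing in for an Internet host


def hosts(*names):
    """Turn VM names into /32 CIDRs."""
    return tuple(VM[n] + "/32" for n in names)


@dataclass(frozen=True)
class Rule:
    priority: int
    src: tuple
    dst: tuple
    action: str  # "allow", "deny", or "inspect"


def matches(ip, cidrs):
    return any(ip_address(ip) in ip_network(c) for c in cidrs)


def evaluate(rules, src_ip, dst_ip):
    """First matching rule wins; fall back to the implied ingress deny."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(src_ip, rule.src) and matches(dst_ip, rule.dst):
            return rule.action
    return "deny"


PROD_INGRESS = [
    Rule(1, hosts("A"), hosts("B"), "allow"),
    Rule(2, hosts("C"), hosts("D"), "inspect"),
    Rule(3, hosts("E"), hosts("A"), "inspect"),
    Rule(4, RFC1918, ANY, "deny"),
    Rule(5, ANY, hosts("A", "B"), "inspect"),
    Rule(6, ANY, ANY, "deny"),
]

for src, dst in [(VM["A"], VM["B"]), (VM["E"], VM["A"]), (VM["E"], VM["B"]),
                 (INTERNET, VM["A"]), (INTERNET, VM["C"])]:
    print(src, "->", dst, ":", evaluate(PROD_INGRESS, src, dst))
```

In this model, only A and B are reachable from the Internet, and every Internet flow that is admitted passes through inspection.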

Step 4

Dev — Egress Policy Rules
1 — All instances to 0.0.0.0/0 → Apply SPG
2 — VPC firewall policy prod

We now ensure that traffic destined for the Internet is inspected.

However, because 0.0.0.0/0 also covers the RFC1918 ranges, this rule additionally inspects internal traffic, including intra-VPC communication.

Step 5 (final)

Dev — Egress Firewall Policy Rules
1 — All instances to RFC1918 → allow
2 — All instances to 0.0.0.0/0 → Apply SPG

Prod — Ingress Firewall Policy Rules
1 — A to B → allow
2 — C to D → Apply SPG
3 — E to A → Apply SPG
4 — RFC1918 to all instances → deny
5 — 0.0.0.0/0 to A, B → Apply SPG
6 — 0.0.0.0/0 to all instances → deny

We introduce an egress rule to allow all RFC1918 traffic without inspection. Filtering for RFC1918 traffic is already done at ingress. All other egress traffic (to the Internet) will be inspected.
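The difference between the Step 4 and Step 5 egress chains can be illustrated with a small destination-only evaluator. This is a sketch under assumptions: the tuple layout and the `evaluate_egress` helper are hypothetical, `"inspect"` stands for "Apply SPG", and the destination addresses are made up.

```python
from ipaddress import ip_address, ip_network

RFC1918 = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
ANY = ("0.0.0.0/0",)


def evaluate_egress(rules, dst_ip):
    """rules: list of (priority, destination CIDRs, action); first match wins."""
    for _, cidrs, action in sorted(rules, key=lambda r: r[0]):
        if any(ip_address(dst_ip) in ip_network(c) for c in cidrs):
            return action
    return "allow"  # implied egress allow


STEP4_EGRESS = [(1, ANY, "inspect")]
STEP5_EGRESS = [(1, RFC1918, "allow"), (2, ANY, "inspect")]

internal_dst = "10.0.1.10"    # hypothetical internal VM
internet_dst = "203.0.113.7"  # documentation address standing in for the Internet

# Step 4: internal destinations are caught by the catch-all inspection rule.
print(evaluate_egress(STEP4_EGRESS, internal_dst))
# Step 5: internal destinations skip egress inspection, Internet ones do not.
print(evaluate_egress(STEP5_EGRESS, internal_dst))
print(evaluate_egress(STEP5_EGRESS, internet_dst))
```

Placing the RFC1918 allow rule ahead of the catch-all is what keeps internal flows out of the egress inspection path.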

The model intentionally:

Avoids inspection at the egress where possible: RFC1918 traffic in particular is not inspected on egress, because it is already filtered and inspected at ingress.
Minimizes egress rule modifications: Stresses making changes primarily in the ingress chain, promoting a centralized point of control and reducing the need for constant egress rule adjustments.
Doesn’t utilize Hierarchical Firewall Policies: While acknowledging their value, it prioritizes simplicity and reserves Hierarchical Policies for their intended use cases, avoiding unnecessary complexity.

Firewall rules: common pitfalls

I often come across some common errors users make when configuring their Firewall Policy rules. I’d like to share these with you to help you avoid making the same mistakes in your deployments.

Avoid inspecting traffic twice

In the scenario illustrated below, traffic flowing from VM E to VM A undergoes inspection twice. This redundancy introduces unnecessary latency and increases costs.

Dev — Egress Firewall Policy Rules
1 — All instances to 0.0.0.0/0 → Apply SPG

Prod — Ingress Firewall Policy Rules
1 — E to A → Apply SPG

Inspect traffic for no reason

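Avoid unnecessary traffic inspection. It leads to needless latency and additional costs without any tangible benefits. In the example below, traffic from VM E to VM A is inspected on egress from Dev, only to be denied on ingress to Prod.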

Dev — Egress Firewall Policy Rules
1 — All instances to 0.0.0.0/0 → Apply SPG

Prod — Ingress Firewall Policy Rules
1 — E to A → deny
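The two pitfalls above can be caught with a simple check that evaluates a flow against the sender's egress chain and the receiver's ingress chain, then classifies the pair of outcomes. This is only a sketch: the addresses for E and A, the tuple layout, and the `classify` labels are assumptions made for illustration.

```python
from ipaddress import ip_address, ip_network

E, A = "10.1.1.10", "10.0.1.10"  # hypothetical addresses for VM E (dev) and VM A (prod)
RFC1918 = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
ANY = ("0.0.0.0/0",)


def evaluate(rules, src_ip, dst_ip, default):
    """rules: list of (priority, src CIDRs, dst CIDRs, action); first match wins."""
    for _, srcs, dsts, action in sorted(rules, key=lambda r: r[0]):
        if (any(ip_address(src_ip) in ip_network(c) for c in srcs)
                and any(ip_address(dst_ip) in ip_network(c) for c in dsts)):
            return action
    return default


def classify(egress_rules, ingress_rules, src_ip, dst_ip):
    egress = evaluate(egress_rules, src_ip, dst_ip, "allow")   # implied egress allow
    ingress = evaluate(ingress_rules, src_ip, dst_ip, "deny")  # implied ingress deny
    if egress == "deny":
        return "blocked at egress"
    if egress == "inspect" and ingress == "inspect":
        return "double inspection"
    if egress == "inspect" and ingress == "deny":
        return "wasted inspection"
    if ingress == "deny":
        return "blocked at ingress"
    if "inspect" in (egress, ingress):
        return "inspected once"
    return "allowed uninspected"


DEV_EGRESS_CATCH_ALL = [(1, ANY, ANY, "inspect")]
DEV_EGRESS_FINAL = [(1, ANY, RFC1918, "allow"), (2, ANY, ANY, "inspect")]
PROD_INSPECT = [(1, (E + "/32",), (A + "/32",), "inspect")]
PROD_DENY = [(1, (E + "/32",), (A + "/32",), "deny")]

print(classify(DEV_EGRESS_CATCH_ALL, PROD_INSPECT, E, A))
print(classify(DEV_EGRESS_CATCH_ALL, PROD_DENY, E, A))
print(classify(DEV_EGRESS_FINAL, PROD_INSPECT, E, A))
```

With the final egress chain from Step 5, the same E to A flow is inspected exactly once, at ingress.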

Setting up unreachable rules

In this scenario, traffic from E to A is always inspected: rule 1 matches it first, so evaluation never reaches rule 2.

Dev — Egress Firewall Policy Rules
1 — All instances to 0.0.0.0/0 → Apply SPG
2 — E to A → allow
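Unreachable rules of this kind can be flagged mechanically by checking whether an earlier rule fully covers a later one. The sketch below is deliberately simplified: it assumes one source and one destination CIDR per rule, ignores protocols and ports, and only detects a rule shadowed by a single earlier rule (not by a combination of several). The addresses for E and A and the `shadowed_rules` helper are hypothetical.

```python
from ipaddress import ip_network

E, A = "10.1.1.10/32", "10.0.1.10/32"  # hypothetical addresses for VM E and VM A
ANY = "0.0.0.0/0"


def shadowed_rules(rules):
    """rules: list of (priority, src CIDR, dst CIDR, action).

    Returns (shadowed_priority, shadowing_priority) pairs for every rule whose
    source and destination are fully contained in an earlier rule's.
    """
    ordered = sorted(rules, key=lambda r: r[0])
    found = []
    for i, (prio, src, dst, _) in enumerate(ordered):
        for earlier_prio, e_src, e_dst, _ in ordered[:i]:
            if (ip_network(src).subnet_of(ip_network(e_src))
                    and ip_network(dst).subnet_of(ip_network(e_dst))):
                found.append((prio, earlier_prio))
                break
    return found


BROKEN = [(1, ANY, ANY, "inspect"), (2, E, A, "allow")]
FIXED = [(1, E, A, "allow"), (2, ANY, ANY, "inspect")]

print(shadowed_rules(BROKEN))  # rule 2 is covered by rule 1
print(shadowed_rules(FIXED))   # the specific rule now comes first
```

Putting the more specific rule at a higher priority (a lower number) than the catch-all is what makes it reachable.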

Conclusions

Migrating to NGFW Enterprise requires a shift in thinking about network security architecture. It’s not just about swapping out old firewalls for new ones; it’s about adapting to a model where inspection happens at the workload level, not at the perimeter.

This new model offers greater flexibility and scalability, but it also demands a more granular and deliberate approach to Firewall Policy design. It’s crucial to carefully consider the flow of traffic between and within VPCs, and to avoid common pitfalls like double or unnecessary inspection.

As organizations continue to adopt cloud-native architectures, NGFW Enterprise will become an increasingly important tool for maintaining network security. By understanding the architectural changes and policy design considerations involved, you can ensure a smooth and successful transition to this new paradigm.