AWS VPC: building-blocks and principles (part 1)

Reviewing the VPC components, their features and configuration, one step before an actual practice

Published in DevTechBlogs · 16 min read · Aug 11, 2018

Defining a network is fundamental to building and managing cloud resources. AWS lets you implement and maintain your own private network with the VPC toolkit. This blog post presents the underlying VPC components, their configuration and behaviour, and the essential knowledge you need to define a VPC network.

What’s on the menu?

This post can be useful as a short intro for VPC. I hope those who are more experienced with AWS will find it enriching as well. This topic is huge, so I tried to extract the essence of the VPC’s building-blocks.

So, what’s on the menu?

Part 1: VPC’s main components: VPC, subnet, route tables, security groups and Network ACL.
Connectors to the internet: Internet gateway or NAT gateway.
IP address principles (as a short memory refresh, for those who graduated many years ago).
Part 2: Configuring a VPC network that is composed of two subnets with EC2 instances, connecting the network to the internet and setting its traffic rules.

At first, I planned to include both parts in one blog post, but the content just poured out and it turned into quite a long post. Eventually, I divided it into two parts: principles and practice. If you’re already familiar with VPC’s building blocks, you can jump right to the guide in the second part.

Ready? Let’s get started!

VPC’s Main Components

What is VPC?

VPC stands for Virtual Private Cloud. In fact, it is your own private network inside AWS. It resembles the network that you would configure and operate in a traditional on-premises data center. A VPC’s configuration encapsulates the network and communication definitions for a certain group of resources.

A VPC is configured in one region and cannot cross regions. By default, AWS allows up to 5 VPCs per region (a quota that can be raised on request). A VPC spans all the Availability Zones (referred to as AZs from here on) in its region. To actually spread a network over more than one AZ, you need to configure subnets in different AZs.

If you browse through the various regions in your AWS account, you’ll probably notice that each region already has a VPC. That’s your default VPC for the region, and it already has subnets created for you: one subnet for each AZ in the region.

Moreover, AWS has created a default route table, network ACL and security group as well (don’t worry, we’ll dive into each of these components later in this blog post). These default resources allow you to immediately start launching AWS services that require basic VPC components, without the hassle of defining them beforehand. For example, the steps to configure an EC2 instance include assigning a VPC, a subnet and a security group. You can assign the default resources to simplify the setup process and save time.

Similarly, once you create a new VPC resource, AWS provisions a default route table, security group and Network ACLs, but leaves the subnet creation to you, as it depends on your desired architecture. There are configuration differences between the default VPC components and the VPC you create, which will be covered later.

Subnet — definition and types

A subnet is a sub-section of the VPC network. Each subnet has an IP address range that isolates its resources. Each subnet must reside in one availability zone (AZ), whereas VPC can be spread across the region.

You may be asking yourself what’s the benefit of using subnets? Why not hold all components under a subnet that covers the whole VPC? Well, there are several reasons:

Security: Segregating resources from one another allows you to control and tighten their incoming/outgoing traffic as well as the authentication rules. Some traffic rules can be made more granular and controlled at the subnet level (network ACL). Another aspect of security is logging the traffic and identifying potential or actual breaches. This can be done efficiently when the sensitive resources reside in isolated subnets.
Logical differentiation facilitates network management and maintenance: When you have only a few instances of AWS resources, managing your network is a simple task. However, when you have tens or even hundreds of components, this task becomes much harder. Therefore, organizing your network in groups can save you a lot of headaches. Moreover, this separation lets you assign a purpose to each subnet, for example, one subnet for production and another for testing.
Redundancy and fault-tolerance: AWS best practice is to plan redundancy for any valuable resource and to avoid, as much as possible, a single point of failure. Designing redundancy between two or more availability zones is achievable only when the network is spread over more than one availability zone. Therefore, setting up several subnets improves our system’s redundancy and enables designing a fault-tolerant system.
Network expansion: subnets allow the virtual network to expand to more availability zones by allocating more IP addresses. Expansion can emerge from a higher load of existing resources (scale-out) or organic growth and the need for additional resources.

In terms of communicating with external networks, a subnet can be classified as public, private or VPN only:

A public subnet is one whose traffic can be routed to an internet gateway. This definition resides in the route table that is associated with the subnet. Resources in a public subnet need a public IP address to be reachable from the internet. You can have AWS assign one automatically by setting the subnet flag “Enable auto-assign public IPv4 address” to true, which requests a public IPv4 address for every instance launched into the subnet. This setting can be overridden when configuring an instance in the subnet (such as an EC2 instance) by choosing Enable or Disable.
A private subnet doesn’t have a route to an internet gateway. Without further configuration, its traffic is limited to other subnets within the VPC.
If a subnet’s traffic is routed to a virtual private gateway for a VPN connection, it’s called a VPN-only subnet.
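To make the three categories concrete, here is a minimal sketch in Python that classifies a subnet by looking at the targets in its route table. It is a simplified model, not an AWS API call: the function name `classify_subnet` and the sample IDs are made up for illustration, and it assumes the usual ID prefixes (`igw-` for an internet gateway, `vgw-` for a virtual private gateway).

```python
def classify_subnet(route_targets):
    """Classify a subnet from the targets of its route table.

    route_targets: iterable of target IDs, e.g. ["local", "igw-0abc"].
    The IDs used here are hypothetical placeholders.
    """
    targets = list(route_targets)
    # A route to an internet gateway makes the subnet public.
    if any(t.startswith("igw-") for t in targets):
        return "public"
    # No internet gateway, but a route to a virtual private gateway.
    if any(t.startswith("vgw-") for t in targets):
        return "vpn-only"
    # Only the local route (or a NAT gateway route) remains.
    return "private"


print(classify_subnet(["local", "igw-0abc"]))  # public
print(classify_subnet(["local", "vgw-0def"]))  # vpn-only
print(classify_subnet(["local"]))              # private
```

Note that the classification depends only on the route table, which is why moving a subnet to a different route table can silently turn it from private to public.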

Route tables: traffic navigation control

A route table (RT) is a set of rules that determines where traffic leaving a subnet is directed. These rules are what open connections between networks. AWS creates a default route table for every VPC, which is marked as “main”. Every VPC has exactly one main route table, although you can assign the “main” role to another route table attached to the same VPC. The main route table governs the routing for all subnets that are not explicitly associated with another route table in the same VPC.

Every route table has at least one fixed rule, the local route, which allows communication within the VPC. By that, AWS prevents a subnet isolation scenario.

There can be more than one route table associated with a VPC, but only one “main”. The incentive for configuring more than one route table is to differentiate the communication definitions of subnets, for example, permitting one subnet to reach the internet gateway while keeping the others without such a routing rule. This is how public and private subnets are defined.

A subnet can be associated with only one route table, in order to avoid ambiguity. AWS doesn’t block you from selecting a subnet that is already associated with another RT. However, once the save operation completes, the subnet is removed from the previous RT and assigned to the selected RT without any message to the user. So, be aware of this behaviour.

To facilitate the selection of available subnets, AWS presents the subnet without any associated RT under:

What about deleting route tables? To avoid orphan communication definitions, AWS allows you to delete a route table only if it has no dependencies, meaning it has no associated subnets, and only if it is not the main route table.

See the AWS route tables user guide for more information.

Managing security traffic rules

Under VPC, there are two main factors that govern the VPC’s communication internally and externally. These are the Security Groups and the Network ACLs.

Security Groups

A security group holds a list of rules for allowing incoming and outgoing traffic to and from its associated resources. Practically, a security group is similar to the firewall (FW) on the instance level, although there are some differences between a traditional FW and security group that will be covered shortly.

A security group belongs to exactly one VPC. As with route tables, a newly created VPC comes with a default security group. Resources launched in the VPC are associated with this group unless you choose another one. For example, when setting up an EC2 instance, we need to choose a security group that belongs to the VPC. There is always at least one security group, since the default security group cannot be deleted.

The default security group of each VPC has an outbound rule that allows all traffic and an inbound rule that allows all traffic, but only from resources associated with the same security group. In contrast, a newly created security group has no inbound rules at all and a single outbound rule that allows all traffic. It’s important to remember this behaviour after associating instances with a non-default security group.

AWS has found a creative way to relate security groups with other resources. A security group can be associated with one or more AWS resources, such as an EC2 instance or a load balancer. Furthermore, one resource can be associated with many security groups. By that, a resource can “inherit” many rules from many security groups. This many-to-many relation is an easy way to group rules.

The screenshot below explains how to attach/detach a security group to/from an EC2 instance:

As opposed to other firewall products, a security group’s rules always “Allow” traffic; there are no “Deny” rules. Since all the rules enable communication rather than block it, assigning many security groups to the same instance never creates a conflict: the permissions simply accumulate.
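The accumulation of allow rules can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how AWS implements it: `AllowRule`, `inbound_allowed`, `web_sg` and `admin_sg` are hypothetical names, a rule matches a single port, and the source is always a CIDR block.

```python
import ipaddress
from typing import NamedTuple, Optional


class AllowRule(NamedTuple):
    protocol: str          # "tcp", "udp", "icmp" or "all"
    port: Optional[int]    # None means any port
    source: str            # CIDR block, e.g. "0.0.0.0/0"


def inbound_allowed(security_groups, protocol, port, source_ip):
    """Return True if any rule in any attached group allows the traffic."""
    ip = ipaddress.ip_address(source_ip)
    for group in security_groups:
        for rule in group:
            if rule.protocol not in ("all", protocol):
                continue
            if rule.port is not None and rule.port != port:
                continue
            if ip in ipaddress.ip_network(rule.source):
                return True
    return False  # no matching allow rule, so the traffic is dropped


# Two hypothetical groups attached to the same instance.
web_sg = [AllowRule("tcp", 443, "0.0.0.0/0")]
admin_sg = [AllowRule("tcp", 22, "203.0.113.0/24")]

print(inbound_allowed([web_sg, admin_sg], "tcp", 443, "198.51.100.7"))  # True
print(inbound_allowed([web_sg, admin_sg], "tcp", 22, "198.51.100.7"))   # False
print(inbound_allowed([web_sg, admin_sg], "tcp", 22, "203.0.113.9"))    # True
```

Because there is no deny rule to short-circuit on, the order of groups and rules does not affect the result.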

Another difference from traditional firewall products is that a rule names only one side of the connection: an inbound rule has a source and an outbound rule has a destination. The other side is always the resource associated with the security group (for example the EC2 or ELB instance), so “incoming” and “outgoing” are defined relative to that resource.

One thing that is easy to trip over is ICMP traffic. A ping is answered only if an inbound rule covers ICMP from your source address: a newly created security group has no inbound rules at all, and the default group’s “All traffic” inbound rule applies only to traffic coming from members of the same group. So, to receive a ping response from outside, you need to explicitly add a rule that allows it. I forgot this fact while trying to check my EC2 instance’s availability, which cost me 5 precious minutes 🙃.

Allow ICMP traffic in the security group rules

Security groups are stateful, which means that if a request was sent from an instance (based on an outbound rule), the response traffic for that request will not be blocked, regardless of whether a specific inbound rule allows it. The same applies in the opposite direction: responses to allowed inbound traffic are not blocked from flowing out, regardless of the defined outbound rules.

If you’re interested in reading more about security groups, see the AWS security groups guide.

Network ACLs

A Network Access Control List (referred to as Network ACL or NACL for short) is a set of communication rules that controls the traffic to and from a subnet. It provides supplementary features on top of the security groups’ rules.

Unlike security groups, a NACL is associated with subnets rather than with individual AWS resources. By design, each subnet is associated with exactly one NACL, and thus assigning a subnet to another NACL removes it from its previous NACL.

Another difference between security groups and NACLs is the rule type: a NACL rule can be specified as either “allow” or “deny”.

NACL rules allow or deny traffic entering and leaving a subnet. Inside the subnet itself, AWS resources can have an additional security layer defined by the security groups’ rules.

More NACL features and behaviours:

The default VPC’s NACL cannot be deleted.
A NACL’s subnet association list can be empty; subnets with no specifically associated NACL follow the default NACL’s rules.
The default VPC’s NACL has an “allow all” rule, whereas any additional NACL you create will have a “deny all” rule. Therefore, after associating a subnet with a newly created NACL, don’t forget to change this default rule.
NACL is stateless, as opposed to a security group. It means that replies to allowed incoming requests (inbound) will be approved or denied based on the outbound rules.
The order of the NACL rules is very important. Each rule has a number (“rule #”), and rules are evaluated in ascending order, starting from the lowest number; the first rule that matches the traffic is applied. The asterisk rule (“*”) is the catch-all that denies any traffic not matched by a numbered rule.

Since the rules are evaluated in order, the most general rules should be given higher rule numbers than the specific ones; otherwise the specific rules never get a chance to match. In the example below, although a rule to block ICMP traffic exists, ICMP traffic will flow because the generic rule that allows all traffic has a lower rule number and is therefore evaluated first:
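The same ordering pitfall can be reproduced with a small Python sketch. It is only a model of first-match evaluation, assuming rules that match on protocol alone; the function name `evaluate_nacl` and the rule numbers are chosen for illustration.

```python
def evaluate_nacl(rules, protocol):
    """Evaluate NACL-style rules for one protocol.

    rules: list of (rule_number, protocol, action) tuples, where the
    protocol "all" matches anything. Rules are checked in ascending
    rule-number order and the first match wins. If nothing matches,
    the implicit "*" rule denies the traffic.
    """
    for _number, rule_protocol, action in sorted(rules):
        if rule_protocol in ("all", protocol):
            return action
    return "deny"


# The generic allow rule comes first, so the ICMP deny is never reached.
misordered = [(100, "all", "allow"), (200, "icmp", "deny")]
# The specific deny gets a lower number than the generic allow.
fixed = [(90, "icmp", "deny"), (100, "all", "allow")]

print(evaluate_nacl(misordered, "icmp"))  # allow
print(evaluate_nacl(fixed, "icmp"))       # deny
print(evaluate_nacl(fixed, "tcp"))        # allow
```

Leaving gaps between rule numbers (90, 100, 200) is a common habit because it leaves room to slot a more specific rule in front of a general one later.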

Want to enrich yourself with more information about Network ACL? Refer to AWS user guide.

Evaluation of traffic rules: Which has precedence?

So, there are two distinct components that define the traffic rules. For incoming traffic, the inbound evaluation order is:

NACL rules are evaluated first based on their order (as mentioned, a subnet must be associated with only one NACL).
Security group’s rules are evaluated afterwards. Their order doesn’t matter since all the rules are allowing the traffic to flow.
The incoming traffic is blocked if no “allow” rule has been found.

The outbound evaluation order is the opposite order: first the security groups’ rules and then the NACL’s rules.

Amazon advises using security groups for whitelisting traffic and NACL for blacklisting traffic.

A short pause to breathe: what have we covered so far?

The sample diagram below outlines almost everything we’ve discussed so far: a VPC network with two subnets, each in a different availability zone. It emphasizes the traffic control order: first at the network level (NACL of each subnet) and then the security groups of the resources that reside in the subnets.

After taking a short breath, our next step is to connect the VPC to external networks.

Connecting our VPC to the internet

AWS offers several options to connect a VPC network to other networks, such as the internet, other VPCs or a VPN.

For accessing the internet, AWS provides managed resources (Internet gateway and NAT gateway) or resources managed by you (NAT instance). I’ll describe only the first type and leave the NAT instance for another blog-post.

Internet Gateway

An internet gateway is a two-way bridge between the VPC and the internet. It is a managed AWS resource, and as such it is horizontally scaled automatically, redundant and highly available.

An internet gateway can be attached to only one VPC at a time. To move a gateway to a different VPC, you must detach it from the previous VPC and then attach it to the new one.

Once an internet gateway is attached to VPC, you can configure it as a valid route in the route table. Afterwards, this route will appear as “active”. However, if the internet gateway is detached from the VPC then the route table will indicate the routing as a “black hole” since the destination (the gateway) is not connected to any network.

An internet gateway, attached to route table, with no VPC

A subnet with access to the internet is called a public subnet. To control the internet access and allow it only to specific subnets, you need to alter only the route tables of these selected subnets, while leaving the other subnets without any configured route to the internet.

The route’s destination is set to any address (0.0.0.0/0), so that all traffic not addressed to the VPC itself is sent to the internet gateway rather than only a certain IP range. Traffic inside the VPC keeps using the local route. Anyway, AWS blocks you from defining a destination more specific than the VPC’s own IP address range.
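How the local route and the catch-all route coexist is easier to see in code. The following is a rough Python sketch of picking the most specific matching route; the VPC range 10.0.0.0/16, the gateway ID `igw-0abc` and the function name `pick_route` are assumptions made for the example.

```python
import ipaddress


def pick_route(route_table, destination_ip):
    """Return the target of the most specific route matching the destination.

    route_table: list of (destination_cidr, target) pairs.
    Returns None when no route matches.
    """
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix is the most specific route.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical route tables for a VPC with the range 10.0.0.0/16.
public_rt = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-0abc")]
private_rt = [("10.0.0.0/16", "local")]

print(pick_route(public_rt, "10.0.1.25"))       # local
print(pick_route(public_rt, "93.184.216.34"))   # igw-0abc
print(pick_route(private_rt, "93.184.216.34"))  # None
```

The last call shows why a subnet whose route table has only the local route is private: there is simply nowhere to send internet-bound traffic.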

Once the route has been defined, controlling the specific traffic rules (protocols, ports, source or destination) is done by amending the security group or the NACL of the subnet.

NAT Gateway

A Network Address Translation (NAT) gateway enables instances that reside in a private subnet to connect to the internet (or other AWS services), but prevents the internet from initiating a connection with those instances. It can be useful for downloading patches to our private subnet resources, for example. Since the NAT gateway is a managed AWS resource, it is redundant, scales automatically and is patched by Amazon, as opposed to a NAT instance, which is managed by the end user. Describing the NAT instance is out of scope for this blog post; see the AWS documentation for the differences between the two.

There are several steps to configure a private subnet that connects to the internet via NAT gateway:

Firstly, you need a VPC with at least two subnets, one public and one private. Afterwards, you need to create a new NAT gateway resource and place it in the public subnet (since it needs an active route to an external network). Please note, you must have an Elastic IP (EIP) address attached to the NAT gateway. In case you haven’t allocated one before, you will be able to create a new EIP as part of the NAT gateway creation wizard.

Secondly, as part of configuring the routing, the private subnet’s route table should have a rule that allows a connection to the NAT gateway:

Lastly, to check the NAT gateway works properly, you need to try accessing the internet from an instance located in a private subnet. This whole scenario is covered in the AWS short demo.

One last note: if you delete a NAT gateway but keep a routing rule that points to it in one of the route tables, the route status becomes “Black Hole”, as shown below:

For more information about NAT gateway, see AWS NAT gateway.

IP address: can you please remind me the basics?

After reviewing the building blocks of the VPC and before diving into the actual implementation of one, let’s go back a bit and refresh our memory pertaining to IP address principles.

It might seem a bit tedious, but it is essential to understand the underlying ideas of IP addresses and subnet masks to prevent misconfiguration problems. AWS doesn’t allow changing the IP address range you chose when creating the VPC, and thus a correct configuration is crucial to avoid scenarios that may force you to reconfigure the whole network.

So, this section can be a short refresher for those who were not awake during their bachelor’s degree courses or those who have forgotten 🤓.

An IPv4 address is composed of 4 numbers, each between 0 and 255. Each number represents 8 bits (2⁸ = 256 possible values), for example: 172.16.254.1

The IP address is divided into two parts: the network prefix and the host number. The prefix represents the network, whereas the rest defines the host address. The dividing point is defined by a subnet mask, which separates the address into the network prefix and the host part. The network prefix is the result of a bitwise AND operation between the IP address and the subnet mask.

The table below demonstrates an IP address (192.3.2.131) with the subnet mask 255.255.255.0:
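The same calculation can be checked with Python’s standard `ipaddress` module. This is a small sketch that uses the address and mask from the text; the variable names are mine.

```python
import ipaddress

# Convert the dotted notation into plain 32-bit integers.
address = int(ipaddress.IPv4Address("192.3.2.131"))
mask = int(ipaddress.IPv4Address("255.255.255.0"))

# Bitwise AND with the mask keeps the network prefix bits.
network_prefix = ipaddress.IPv4Address(address & mask)
# Bitwise AND with the inverted mask keeps the host bits.
host_number = address & ~mask & 0xFFFFFFFF

print(network_prefix)  # 192.3.2.0
print(host_number)     # 131
```

With a mask of 255.255.255.0 the first three bytes survive the AND and the last byte is zeroed, which is why the split is so easy to read off by eye in this case.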

The example above uses a mask that falls on a byte boundary, as the classful 8, 16 and 24-bit network masks do, but subnet masking is more complicated when the mask is not made of full bytes.

Subnet masks can be represented in the 4-byte structure or in CIDR format, with /n after the IP address (CIDR stands for Classless Inter-Domain Routing). The /n suffix is the total number of network ID bits. For example, reserving the first 26 bits for the network address, which is two bits more than a 24-bit mask, creates 4 possible subnets:

In the example above, the last 6 bits are available for the host address, meaning 62 (2⁶ – 2) addresses are available. Two addresses cannot be assigned to hosts: the one with all host bits set to zero, which is the network address, and the one with all host bits set to one, which is the broadcast address.
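Here is a short Python sketch of that split using the standard `ipaddress` module. The parent network 192.168.0.0/24 is my assumption for the sake of the example; any /24 network gives the same pattern.

```python
import ipaddress

# Hypothetical parent network; a /26 prefix adds two network bits to a /24.
parent = ipaddress.ip_network("192.168.0.0/24")
subnets = list(parent.subnets(new_prefix=26))

for subnet in subnets:
    # Subtract the network address and the broadcast address.
    hosts = subnet.num_addresses - 2
    print(subnet, subnet.netmask, hosts)
```

Each of the four lines printed shows a /26 block with the mask 255.255.255.192 and 62 host addresses, starting at offsets 0, 64, 128 and 192 in the last byte.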

So, what is the incentive to use CIDR notation rather than a traditional subnet mask? CIDR reduces the problem of wasted address space by providing a flexible way to specify network addresses in routers. CIDR notation lets one routing table entry represent an aggregation of networks that exist in the forward path, without the need to specify a particular gateway. Therefore, if CIDR addressing is used, a single entry can represent a group of networks, which reduces the number of entries in the router. As a result, routers work more efficiently while navigating the traffic.

AWS IP address format is IPv4 with CIDR notation. The CIDR block of a subnet can be the same as the CIDR block of its VPC (for a single subnet in the VPC), or a subset of the CIDR block of the VPC (for multiple subnets).

CIDR blocks of subnets in the same VPC must not overlap, so you must allocate them carefully when creating more than one subnet in a VPC.
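Before creating subnets, it can be handy to sanity-check a planned layout offline. Below is a minimal Python sketch that does so with the standard `ipaddress` module; `validate_subnets` is a hypothetical helper and the 10.0.0.0/16 range is only an example, not something AWS prescribes.

```python
import ipaddress
from itertools import combinations


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []
    # Every subnet must be the VPC block itself or a subset of it.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    # No two subnets may share addresses.
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            problems.append(f"{first} overlaps {second}")
    return problems


print(validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"]))
print(validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.0.128/25"]))
print(validate_subnets("10.0.0.0/16", ["10.1.0.0/24"]))
```

The first call reports no problems, the second flags the overlap between the /24 and the /25 that sits inside it, and the third flags a subnet that falls outside the VPC range.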

AWS reserved IP addresses

AWS prohibits using the first four IP addresses and the last IP address in each subnet. Assuming the subnet’s CIDR block is 20.0.0.0/24, these five addresses are used for:

20.0.0.0: Reserved for a network address.
20.0.0.1: Reserved by AWS for the VPC router.
20.0.0.2: Reserved for the DNS server at the VPC level; AWS decided to reserve the same offset in every subnet as well.
20.0.0.3: Reserved by AWS for future use.
20.0.0.255: Reserved for network broadcast address (although AWS doesn’t support network broadcast in a VPC network).
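The list above can be derived for any subnet with a few lines of Python. This is a sketch based only on the rule described here (first four addresses plus the last one); the function names are hypothetical.

```python
import ipaddress


def aws_reserved_addresses(subnet_cidr):
    """Return the five addresses AWS reserves in a subnet, as described above."""
    subnet = ipaddress.ip_network(subnet_cidr)
    first = subnet.network_address
    return [first, first + 1, first + 2, first + 3, subnet.broadcast_address]


def usable_addresses(subnet_cidr):
    """Number of addresses left for your resources after the five reserved ones."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - 5


print([str(a) for a in aws_reserved_addresses("20.0.0.0/24")])
print(usable_addresses("20.0.0.0/24"))  # 251
print(usable_addresses("20.0.0.0/28"))  # 11
```

The second and third calls are a reminder that small subnets lose a noticeable share of their range: a /28 has 16 addresses, of which only 11 can be assigned.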

Closing notes before moving forward to the practice part

That was a lot of information packaged in one blog post, so give yourself a round of applause. On the one hand, I tried to cover the basics and a bit beyond; on the other hand, I tried not to dwell on any of it for too long.

The best way to imbibe information is through practical experience, therefore I encourage you to proceed to the next blog post.

Hope you enjoyed reading this blog-post and that it helped you gain more knowledge. Please don’t hesitate to give feedback or simply clap your hands 👏.

Lior