AWS Account Migration Journey — Part 1

Sumita Mudgil
MiQ Tech and Analytics
5 min read · Apr 18, 2022

Trying to understand is like straining through muddy water. Have the patience to wait! Be still and allow the mud to settle. Now that I have grabbed your attention and hinted that this is going to be a long post, grab your coffee and follow along!

Background

Security is a process, not a product. Going by this mantra, we at MiQ strive to achieve the best security for our applications in the cloud.

We had all our workloads merged in the same AWS account. To maintain separation of concerns and workload governance, we first migrated the Dev workload to a new AWS account and then set out to migrate all our workloads from the current VPC to a new VPC in AWS. Switching to a new VPC is not easy when applications depend on one another, and we needed to achieve this migration with minimal risk and downtime.

We have 70+ microservices categorized into Tiers 1, 2 and 3, deployed on a Kubernetes cluster and on individual EC2 instances. Services are both public- and private-facing and use a variety of databases (AWS-managed as well as standalone databases maintained in Kubernetes). The migration plan had to consider all these factors.

I could walk you through the architecture of the different services, but that is beyond the scope of this blog. For now, here is a high-level view of the MiQ AWS infrastructure.

Acceptance Criteria

For the sake of simplicity, we defined the acceptance criteria for the migration activity as follows:

  1. Minimal downtime was acceptable for Tier 2 and other non-critical applications
  2. No communication from the old VPC to the new VPC was allowed (I will talk about exceptions later but this was to mitigate security risk on the new VPC)

The migration plan was thought through and discussed in detail, and each step was scripted for execution to avoid surprises and failures, since any failure would most likely result in a negative business impact or extended service downtime. The plan was first tested in the development environment, against a parallel new VPC in the development account.

Because Tier 1 services were involved and services depended on one another, the data migration strategy had to be one of the crucial aspects of the plan.

Setting Off

For a VPC migration involving multiple teams (we have close to 8 teams working on different problems), we had to depend on the technical leaders of those teams to list all their services, along with metadata such as tiering, dependencies on other services, and dependencies on other resources. The technical challenge had to be overcome by precise planning and crisp execution, but the question was: where to begin?

We started by exploring the options available to us for establishing communication between the two VPCs, keeping in mind that old-VPC-to-new-VPC communication wouldn't be allowed.

VPC Communication Options

VPC Peering

From the AWS docs: a VPC peering connection is a networking connection between two VPCs that enables you to route traffic between them using private IPv4 or IPv6 addresses. Instances in either VPC can communicate with each other as if they were within the same network. More on the same can be found here.
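
For illustration, here is a minimal boto3 sketch of what peering the two VPCs would look like; the VPC IDs are placeholders, and as explained below we did not end up choosing this option.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request a peering connection from the new VPC to the old one
# (both VPC IDs below are placeholders).
peering = ec2.create_vpc_peering_connection(
    VpcId="vpc-new-0123456789abcdef0",
    PeerVpcId="vpc-old-0fedcba9876543210",
)
peering_id = peering["VpcPeeringConnection"]["VpcPeeringConnectionId"]

# The owner of the peer VPC then accepts the request, after which routes to the
# peer CIDR are added to route tables on both sides. Peering makes the two VPCs
# behave like a single network, i.e. traffic can flow in both directions.
ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=peering_id)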

AWS PrivateLink

AWS PrivateLink is a highly available, scalable technology that enables you to privately connect your VPC to supported AWS services, services hosted by other AWS accounts (VPC endpoint services), and supported AWS Marketplace partner services. You do not need to use an internet gateway, NAT device, public IP address, AWS Direct Connect connection, or AWS Site-to-Site VPN connection to communicate with the service. Therefore, you control the specific API endpoints, sites, and services that are reachable from your VPC. More on the same can be found here.

Option Finalised — AWS PrivateLink

Why?

This option ensured that only the required services were made accessible from the new VPC to the old VPC, while all other services and resources remained inaccessible. It also helped simplify the communication by allowing only one-way traffic from the new VPC to the old VPC.

Terminology

For simplicity, we will use the following names in this article:

OVPC — Current (old) production VPC

NVPC — New production VPC

FQDN — Fully Qualified Domain Name

Upstream Services — Services on which other services depend, e.g. messaging-service

Downstream Services — Services that depend on other services, e.g. ingestion-service

How?

So if there is an application in OVPC and we want to access it from NVPC, we follow the steps below, sketched in code after the list:

  • Create an endpoint service in OVPC for the service (this will be fronted by a load balancer, as that is how we expose our apps)
  • Then create an endpoint in NVPC and link it to the endpoint service created above
  • Calling this endpoint from NVPC will allow direct access to the application running in OVPC
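
Here is a minimal boto3 sketch of that wiring. The load balancer ARN, VPC ID, subnet IDs and security group IDs are placeholders; note that PrivateLink endpoint services are fronted by a Network Load Balancer.

import boto3

# In practice these two clients would use credentials/roles for the accounts
# that own OVPC and NVPC respectively.
ovpc_ec2 = boto3.client("ec2", region_name="us-east-1")
nvpc_ec2 = boto3.client("ec2", region_name="us-east-1")

OVPC_NLB_ARN = "arn:aws:elasticloadbalancing:...:loadbalancer/net/messaging-service/..."  # placeholder
NVPC_ID = "vpc-0123456789abcdef0"               # placeholder
NVPC_SUBNET_IDS = ["subnet-aaa", "subnet-bbb"]  # placeholders
NVPC_SG_IDS = ["sg-ccc"]                        # placeholder

# 1. Create an endpoint service in OVPC, fronted by the service's load balancer.
svc = ovpc_ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[OVPC_NLB_ARN],
    AcceptanceRequired=False,
)
service_name = svc["ServiceConfiguration"]["ServiceName"]

# 2. Create an interface endpoint in NVPC linked to that endpoint service.
endpoint = nvpc_ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId=NVPC_ID,
    ServiceName=service_name,
    SubnetIds=NVPC_SUBNET_IDS,
    SecurityGroupIds=NVPC_SG_IDS,
)

# 3. Calling the endpoint's DNS names from NVPC reaches the application running
#    in OVPC; nothing in OVPC can use this path to reach NVPC, so traffic stays
#    one-way.
print(endpoint["VpcEndpoint"]["DnsEntries"])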

Accessing services across VPCs with minimum changes

If an application in NVPC wants to connect to an application in OVPC without changing anything, it needs the existing FQDN of that application to resolve to the endpoint created in NVPC; then everything should just work.

Except that anything in OVPC still connecting to the same application would now experience failures, because the endpoint is created in NVPC and is not accessible from OVPC.

To solve this, we need the same FQDN to resolve to different entries in each VPC:

  1. If the calling application and the target application are within the same VPC, the FQDN should resolve to the load balancer (LB)
  2. If the application to be accessed is in the other VPC, the FQDN should resolve to the VPC endpoint

Here in the diagram below, notification-service and ingestion-service depend on messaging-service. notification-service has been migrated to NVPC, while ingestion-service and messaging-service are in OVPC.

In order to access messaging-service (OVPC) from ingestion-service (OVPC), we use the old URL, which is mapped to the load balancer.

To access messaging-service (OVPC) from notification-service (NVPC), we create an endpoint service in OVPC for messaging-service, create an endpoint in NVPC, and map the URL to this endpoint.

To achieve this, we create private hosted zones with the same names (.miqdigital.com and .mediaiqdigital.com) in each VPC; since a private hosted zone is resolved only within its associated VPC, we can have different entries for the same FQDN.
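
Here is a minimal boto3 sketch of that split-horizon DNS setup, assuming illustrative VPC IDs, record names and targets: the same zone name is created as a private hosted zone in each VPC, and the same FQDN is pointed at different targets.

import time
import boto3

route53 = boto3.client("route53")

def create_private_zone(zone_name, vpc_id, region):
    # A private hosted zone associated with one VPC; records in it resolve
    # only inside that VPC.
    resp = route53.create_hosted_zone(
        Name=zone_name,
        VPC={"VPCRegion": region, "VPCId": vpc_id},
        CallerReference=str(time.time()),
        HostedZoneConfig={"Comment": "split-horizon zone", "PrivateZone": True},
    )
    return resp["HostedZone"]["Id"]

def point_record(zone_id, fqdn, target):
    # CNAME used for simplicity; an alias record to the LB/endpoint also works.
    route53.change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": fqdn,
                    "Type": "CNAME",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": target}],
                },
            }]
        },
    )

# One private hosted zone per VPC, with the same name (VPC IDs are placeholders).
ovpc_zone = create_private_zone("miqdigital.com", "vpc-ovpc-placeholder", "us-east-1")
nvpc_zone = create_private_zone("miqdigital.com", "vpc-nvpc-placeholder", "us-east-1")

# In OVPC the FQDN keeps pointing at the existing load balancer ...
point_record(ovpc_zone, "messaging-service.miqdigital.com",
             "internal-lb-placeholder.elb.amazonaws.com")
# ... while in NVPC the same FQDN points at the VPC endpoint created for it.
point_record(nvpc_zone, "messaging-service.miqdigital.com",
             "vpce-placeholder.vpce-svc-placeholder.us-east-1.vpce.amazonaws.com")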

This concludes the first part, in which we went through the available options for cross-VPC communication and finalized the AWS PrivateLink approach.

Now that we have the background and the context, the next chapter of this blog series covers the actual migration process, so stay tuned.
