Building a Microservices API in AWS
The advantages, disadvantages, problems, and solutions
Architecting a microservices app in AWS is a challenge. AWS offers a variety of services that address a wide range of problems, and deciding whether to combine those services or build custom solutions always requires careful analysis.
This post is the first of a series in which we’ll discuss an approach to building a microservices API in AWS, the advantages, disadvantages, problems, and solutions.
Given that we were building an API using microservices, we chose to use AWS API Gateway. The features we valued most were the ability to throttle requests per endpoint and the integration with AWS Cognito, our login/signup provider.
In addition, API Gateway allows a request to be mocked, forwarded to an existing HTTP endpoint, handled by a Lambda function, or passed to an existing AWS service. We can therefore build each service whichever way we think is best, while keeping control over which requests end up in which service. This looks something like:
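As a rough illustration of that routing, the sketch below models a route table in plain Python. The integration type names (`MOCK`, `HTTP_PROXY`, `AWS_PROXY`, `AWS`) are the ones API Gateway uses, but every path, target, and helper name here is hypothetical and only meant to show the idea of one gateway fanning out to differently built services.

```python
# Integration types that API Gateway supports for a REST API method.
INTEGRATION_TYPES = {"MOCK", "HTTP", "HTTP_PROXY", "AWS", "AWS_PROXY"}

# Hypothetical routes: paths and targets are illustrative assumptions only.
ROUTES = {
    ("GET", "/health"): {"type": "MOCK"},
    ("GET", "/orders"): {"type": "HTTP_PROXY", "target": "http://orders.internal.example.com/orders"},
    ("POST", "/users"): {"type": "AWS_PROXY", "target": "users-service-lambda"},
    ("POST", "/events"): {"type": "AWS", "target": "sqs:events-queue"},
}


def integration_for(method, path):
    """Look up which backend handles a given HTTP method and path."""
    route = ROUTES.get((method, path))
    if route is None:
        raise KeyError(f"no integration configured for {method} {path}")
    return route
```

In a real deployment this mapping lives in the API Gateway configuration itself, not in application code.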
Login / Token Management
AWS has a service that provides easy user sign-in and sign-up: AWS Cognito. This service recently launched User Pools, which provide authentication with external providers and MFA, as well as email and phone verification.
To customize Cognito’s behavior, there are triggers that execute a Lambda function on certain events, for example pre-registration checks, confirmation message customization, and custom authentication challenges.
Once a user is logged in, Cognito gives the client three different tokens: an access token, an ID token, and a refresh token. The ID token can be sent to API Gateway to authorize requests. To do so, follow this documentation on the AWS portal.
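As an example of such a trigger, here is a minimal sketch of a Lambda function attached to the Custom Message event. The wording of the messages and the use of the `given_name` attribute are assumptions for illustration; the event fields (`triggerSource`, `request.codeParameter`, `response.emailSubject`, and so on) follow the shape Cognito passes to this trigger.

```python
def lambda_handler(event, context):
    """Cognito Custom Message trigger: personalize the sign-up confirmation message."""
    if event.get("triggerSource") == "CustomMessage_SignUp":
        # Placeholder that Cognito replaces with the real confirmation code.
        code = event["request"]["codeParameter"]
        # 'given_name' is an assumed user attribute; fall back to a generic greeting.
        name = event["request"].get("userAttributes", {}).get("given_name", "there")
        event["response"]["emailSubject"] = "Confirm your account"
        event["response"]["emailMessage"] = f"Hi {name}, your confirmation code is {code}."
        event["response"]["smsMessage"] = f"Your confirmation code is {code}."
    # Cognito expects the (possibly modified) event to be returned.
    return event
```

Events from other trigger sources pass through unchanged, so the same function can be extended one case at a time.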
Using API Gateway + Cognito allows authentication at the API Gateway level, taking this responsibility away from the servers.
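From the client’s side, this amounts to attaching the ID token to each request. The sketch below assumes the Cognito authorizer is configured to read the token from the `Authorization` header; the URL and the function name are hypothetical.

```python
import urllib.request

# Hypothetical API Gateway endpoint, for illustration only.
API_URL = "https://api.example.com/prod/orders"


def build_authorized_request(url, id_token):
    """Attach the Cognito ID token so API Gateway can authorize the call."""
    return urllib.request.Request(
        url,
        headers={"Authorization": id_token},  # assumes the authorizer's token source is this header
        method="GET",
    )


# Usage (performs a real network call, so it is left commented out):
# with urllib.request.urlopen(build_authorized_request(API_URL, id_token)) as resp:
#     print(resp.status)
```

If the token is missing, invalid, or expired, API Gateway rejects the request before it ever reaches a server.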
When building an application, it is essential to keep security in mind. When we sat down with our security experts, they required that no server be publicly available. This means that no request could reach our servers directly from the internet. After applying our security experts’ recommendations, we ended up with the following architecture:
This brought one of API Gateway’s disadvantages to our attention: it cannot forward requests to a server placed inside a VPC that does not allow requests from the internet. The previous design, in which API Gateway forwarded requests directly to our servers, was now broken.
API Gateway + Lambda Proxy
Luckily, there is a way to make this combination work. API Gateway can always forward a request to a Lambda function, and this function can live inside any VPC you choose. This solution looks something like this:
This stack is better explained in this AWS post.
One important disadvantage to consider before choosing this option is that the Lambda function forwards the request to the server only after it has received the entire request. Similarly, it returns the response to API Gateway only after it has received the entire response from the server. You cannot stream anything this way.
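To make the buffering behavior concrete, here is a minimal sketch of such a proxy function, assuming the API Gateway Lambda proxy event format (`httpMethod`, `path`, `headers`, `queryStringParameters`, `body`). The upstream address and helper name are hypothetical, and the code assumes text responses; it is not the exact function we deployed.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical private address of a service inside the VPC.
UPSTREAM_BASE = "http://10.0.1.25:8080"


def build_upstream_request(event, base_url=UPSTREAM_BASE):
    """Translate an API Gateway proxy event into a request for the internal server."""
    query = urllib.parse.urlencode(event.get("queryStringParameters") or {})
    url = base_url + event["path"] + (f"?{query}" if query else "")
    body = event.get("body")
    if body is not None:
        # The whole body is already in memory: nothing is streamed.
        body = base64.b64decode(body) if event.get("isBase64Encoded") else body.encode("utf-8")
    # Drop the public Host header so the internal server sees its own host.
    headers = {k: v for k, v in (event.get("headers") or {}).items() if k.lower() != "host"}
    return urllib.request.Request(url, data=body, headers=headers, method=event["httpMethod"])


def lambda_handler(event, context):
    request = build_upstream_request(event)
    try:
        with urllib.request.urlopen(request, timeout=10) as upstream:
            status = upstream.status
            content_type = upstream.headers.get("Content-Type")
            payload = upstream.read()  # read the full response before returning
    except urllib.error.HTTPError as err:
        # Pass upstream 4xx/5xx responses back to the caller unchanged.
        status, content_type, payload = err.code, err.headers.get("Content-Type"), err.read()
    return {
        "statusCode": status,
        "headers": {"Content-Type": content_type or "application/json"},
        "body": payload.decode("utf-8"),  # assumes a text (e.g. JSON) response
    }
```

Both `event["body"]` and `upstream.read()` hold complete payloads, which is exactly why streaming is not possible with this setup.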
Advantages and drawbacks
As this was the first version of our architecture, we were able to run some tests and noted the following points:
- The API Gateway + Lambda proxy works as expected. It does not add significant latency to requests and allows you to have a secure architecture in which your servers can only be reached from inside the servers’ subnet.
- Detaching the authentication method from our services is something we advise you to do. Not only does it take load off the servers, it also lets you change the authentication method without touching your servers.
- Cognito handles signup, login, email and phone confirmation, and MFA. All of these are provided out of the box and can be easily customized with triggers. For a simple signup/login system, this is more than enough.
- API Gateway provides logs that can help you easily track any issue with requests.
- Server logs can be streamed to Kibana to provide centralized logging.
- This is a complex architecture. Understanding all possible points of failure is not easy and takes time. Additionally, tracking errors across different applications and services can be complicated.
- Managing a subnet’s ACLs is complex. Each server has different needs and therefore requires specific rules to allow only the necessary traffic in and out. When your servers access the internet, they do so through a NAT Gateway, which remaps source ports and makes the rules even more difficult to build.
- At the time of writing, Cognito’s ID token expires 1 hour after it is issued, and the refresh token expires 24 hours after it is issued. There is no way to change the tokens’ duration if your application requires a shorter token life. If you need one, you will have to build a custom solution, something I’ll be addressing in the next post.
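The token lifetime mentioned in the last point can be checked directly, since the ID token is a JWT carrying `iat` (issued at) and `exp` (expiry) claims. The sketch below decodes the payload without verifying the signature, so it is only suitable for inspection, never for authorization. The function names are hypothetical.

```python
import base64
import json
import time


def decode_jwt_claims(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_until_expiry(token, now=None):
    """Seconds left before the token's 'exp' claim is reached (negative if expired)."""
    claims = decode_jwt_claims(token)
    return claims["exp"] - (time.time() if now is None else now)
```

A client can use a check like this to decide when to exchange its refresh token for a new ID token instead of waiting for a rejected request.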
This is a basic microservices architecture you can try in your next project! It’s been a challenge to build it and get used to debugging and tracking errors, but the final result is great.
To find out how we limited Cognito’s tokens, hang tight for my next post.