Migrating a WSGI Python application to Serverless on AWS

Osaetin Daniel
limehome-engineering
5 min read · Jan 21, 2022

At Limehome, we recently migrated an existing WSGI-based Python application to the Serverless platform on AWS.

I’m Daniel (Senior Full Stack Engineer), and in this article I have put together some of the problems we ran into and how we solved them, based on my experience.

This article refers often to the Serverless Framework and its ecosystem, but the solutions discussed here can be applied without it.

1. Binary Types and base64 encoding

This was the first problem we noticed. It’s pretty easy to miss because it only shows up at runtime and doesn’t prevent the application from running.

In our case, our fonts were returned as base64-encoded data instead of binary, so the browser couldn’t understand them and the fonts weren’t rendered properly.

API Gateway cannot pass raw binary data to or from a Lambda function. To send binary data, API Gateway encodes it to base64 before invoking the Lambda function.

The reverse is also true: to return binary data from a Lambda function, you have to encode it to base64, set the appropriate Content-Type header, and add that content type to binaryMimeTypes in API Gateway.
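As a rough illustration of this round trip, here is a minimal sketch of the Lambda proxy integration shapes involved. The helper names (binary_response, request_body_bytes) are made up for this example and are not part of serverless-wsgi, which does this work for you; only the statusCode, headers, body and isBase64Encoded keys come from the API Gateway proxy format.

```python
import base64


def binary_response(data: bytes, content_type: str) -> dict:
    """Build a Lambda proxy response that carries binary data.

    Hypothetical helper: the body is base64 text and the
    isBase64Encoded flag tells API Gateway it may decode it.
    """
    return {
        "statusCode": 200,
        "headers": {"Content-Type": content_type},
        "body": base64.b64encode(data).decode("ascii"),
        "isBase64Encoded": True,
    }


def request_body_bytes(event: dict) -> bytes:
    """Recover the raw request body from a proxy event.

    API Gateway sets isBase64Encoded on the event when it had to
    base64-encode a binary request body before invoking the function.
    """
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")
```

Even with the flag set, the browser only receives real binary if the response's content type is also listed in binaryMimeTypes; otherwise the base64 text is passed through as-is.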

Why did this happen?

serverless-wsgi (which bridges the gap between WSGI and API Gateway) was the culprit here: it was encoding the font’s response body to base64, but we didn’t have the font MIME type in binaryMimeTypes. Without this, API Gateway just treats the response content as text and doesn’t decode it to its original representation.

2. Deployment Package size

AWS Lambda has a hard limit of 50MB for a compressed deployment package and a hard limit of 250MB for the uncompressed package.

We didn’t hit this problem immediately, but we eventually hit it on our CI server because it generated some cache and build artifacts that made it into the final serverless package.

If you experience this problem before your first deployment, there are various strategies to reduce the package size:

  • Include only the code used by your function: Your Lambda function should only contain assets/resources that are used by your application. There’s no reason to have an env.example file in your package if it’s never going to be used. The Serverless Framework allows you to control the files that are included in the final deployment package with the include, exclude and patterns keywords.
  • Optimize your dependencies: You should review your dependencies and see if some unnecessary/dev packages can be excluded from the deployment package. Thankfully, pip doesn’t generate as much cruft as npm and yarn.
  • Rely on the bundled boto3: If your code uses boto3 (which is a big package by Python standards), be aware that the standard Python environment from AWS already includes it. Removing boto3 from your package could save ~60MB and a few seconds of packaging/deployment time.
  • Use S3 to store assets: Assets like images, JSON files, etc. should be moved to S3 or a similar storage service. You can also move big binaries to S3 and download them again when your function starts. This is a last-ditch effort that should be avoided if possible because it could slow down the cold start of the Lambda function.
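Before applying any of these strategies, it helps to see what actually ends up in the package. The sketch below is one possible way to check an already built zip against the two limits mentioned above and to list its largest files. package_report is a hypothetical helper written for this article, not part of the Serverless Framework.

```python
import os
import zipfile

# Limits quoted in this article, in bytes.
ZIPPED_LIMIT = 50 * 1024 * 1024
UNZIPPED_LIMIT = 250 * 1024 * 1024


def package_report(zip_path, top=10):
    """Summarize a deployment zip: sizes, limit check, largest files."""
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        # Skip directory entries, they carry no payload.
        infos = [info for info in zf.infolist() if not info.is_dir()]
    unzipped = sum(info.file_size for info in infos)
    largest = sorted(infos, key=lambda info: info.file_size, reverse=True)[:top]
    return {
        "zipped_bytes": zipped,
        "unzipped_bytes": unzipped,
        "within_limits": zipped <= ZIPPED_LIMIT and unzipped <= UNZIPPED_LIMIT,
        "largest": [(info.filename, info.file_size) for info in largest],
    }
```

Running this in CI against the built package (where your build writes the zip is an assumption you will need to fill in) would have shown us the stray cache and build artifacts long before the deployment failed.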

3. Domain Management

Unlike a traditional setup, where a CNAME or A record points to a public-facing web server or load balancer with a public IP address, API Gateway doesn’t provide a public address that you can simply point to from your domain registrar’s control panel.

To use a custom domain with API Gateway, there are three steps:

  1. Create an ACM certificate for the domain (a wildcard certificate is better for multiple subdomains).
  2. Map the domain to an API Gateway stage.
  3. Point the domain to the mapped API Gateway stage.

serverless-domain-manager greatly simplifies this process and if you use AWS Route53, it can even create a Route53 Record!

It’s also worth mentioning that the domain’s certificate must be created in the us-east-1 region if you’re using an edge-optimized endpoint in API Gateway. Edge-optimized endpoints are served through CloudFront, which only accepts ACM certificates from us-east-1, the first AWS region.

Using a certificate from another region will not work!
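If you are not using serverless-domain-manager, steps 2 and 3 can be sketched with boto3 roughly as follows. This is a sketch under assumptions, not what the plugin does internally: map_custom_domain and certificate_region are hypothetical helpers, the certificate is assumed to be already issued, and apigw is assumed to be a boto3 API Gateway client that you create yourself with boto3.client("apigateway").

```python
def certificate_region(certificate_arn):
    """Extract the region from an ARN (arn:partition:service:region:...)."""
    return certificate_arn.split(":")[3]


def map_custom_domain(apigw, domain, certificate_arn, rest_api_id, stage):
    """Map a custom domain to a stage of an edge-optimized REST API.

    Returns the hostname your DNS record should point to.
    """
    # Edge-optimized endpoints need a certificate from us-east-1.
    if certificate_region(certificate_arn) != "us-east-1":
        raise ValueError("edge-optimized domains need a us-east-1 certificate")

    # Register the custom domain with API Gateway.
    created = apigw.create_domain_name(
        domainName=domain,
        certificateArn=certificate_arn,
        endpointConfiguration={"types": ["EDGE"]},
    )
    # Map the domain to a stage of the API (step 2).
    apigw.create_base_path_mapping(
        domainName=domain,
        restApiId=rest_api_id,
        stage=stage,
    )
    # Target for the DNS record (step 3).
    return created["distributionDomainName"]
```

The returned hostname is what you would enter as the target of a CNAME (or a Route53 alias) for the domain.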

4. Payload limits (File upload limits)

There is a hard limit of 6MB when it comes to AWS Lambda payload size. This means we cannot send more than 6MB of data to AWS Lambda in a single request.

You will definitely run into this limit if you upload files directly to your API before uploading them to S3.

We circumvented this limit by using S3 pre-signed URLs for file uploads. The API gives the client a pre-signed URL that accepts a file and saves it to S3 directly.

Pre-signed URLs are a great solution overall for uploading large files, even if you don’t use Lambda.
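A minimal sketch of the server side of this flow could look like the following. The function names, the uploads/ key prefix and the 5-minute expiry are assumptions made for illustration, not our production code; s3 is assumed to be a boto3 S3 client that you create yourself with boto3.client("s3").

```python
import json
import uuid


def create_upload_url(s3, bucket, key, content_type, expires_in=300):
    """Ask S3 for a URL the client can PUT the file to directly."""
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key, "ContentType": content_type},
        ExpiresIn=expires_in,  # seconds
    )


def upload_url_response(s3, bucket, filename, content_type):
    """Build an API response that hands the pre-signed URL to the client."""
    # Random prefix so two uploads with the same filename don't collide.
    key = f"uploads/{uuid.uuid4().hex}/{filename}"
    url = create_upload_url(s3, bucket, key, content_type)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"upload_url": url, "key": key}),
    }
```

The client then sends the file with an HTTP PUT to upload_url, using the same Content-Type it asked for, so the file never passes through Lambda.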

5. IAM Roles

If your web app interacts with the AWS API, either directly with the AWS CLI or through a library like boto3, you might be surprised to find that these calls no longer work and return some form of PermissionDenied error, even if your AWS token has access to these resources.

This is because AWS Lambda (like all other AWS components that can call other services) needs its own permissions to do so, granted through the IAM role the function runs with.

This acts as an extra layer of security and limits privilege escalation, because malicious code cannot arbitrarily access other AWS resources if the Lambda function itself cannot access them.
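To make this concrete, here is a small sketch that builds the kind of policy document the function’s role would need in order to read and write objects in a single S3 bucket. The helper names and the default actions are assumptions for illustration; only the document layout (Version, Statement, Effect, Action, Resource) is the standard IAM policy format.

```python
import json


def s3_object_policy(bucket, actions=("s3:GetObject", "s3:PutObject")):
    """Build an IAM policy document limited to objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Only objects inside this bucket, nothing else.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def policy_json(bucket):
    """Serialize the policy so it can be attached to the function's role."""
    return json.dumps(s3_object_policy(bucket), indent=2)
```

Granting only the actions and resources the function really uses keeps the benefit described above: code running inside the function can’t reach anything the role doesn’t list.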

Conclusion

In this article, we talked about some of the problems we faced when migrating a monolithic WSGI-based Python app to the serverless platform.

I deliberately didn’t go into much detail, to keep the article as accessible as possible.

Serverless computing has its benefits, but it also comes with various challenges. Some of these challenges, like payload limits, are artificial: they come from specific AWS limitations and engineering decisions. Others are fundamental, as they’re baked into the serverless computing model.

It’s the responsibility of a good software engineer to weigh the pros and cons and decide whether serverless is a good fit compared to a traditional server setup or a hybrid PaaS like Heroku.

By the way, check out the open roles in our Tech and Engineering team.

Osaetin Daniel
limehome-engineering

Full Stack developer: Python, JavaScript, Go and some other things here and there.