Published in CodeX

CDK, Python and moto

Mocking the AWS infrastructure from CDK CloudFormation configs

The Cloud Development Kit, or CDK, is Amazon's second attempt, after the arguably failed CloudFormation, at a comfortable and functional deployment framework. Here is how you can test it.

CloudFormation: say hello

While web development slowly shifts towards serverless architecture, the one and only AWS Lambda is the core of Amazon's take on this historical turn. The progression is logical: from bare metal to virtualization and then to containerization, the details of the physical computing environment fade away, giving developers an opportunity to concentrate on the domain details of the problem. At least in theory, of course. A similar shift happened in personal computing decades ago, when operating systems started to work in the domain of virtual memory, for example, and application developers no longer had to care about memory allocation on physical devices.

Offering a function, the core of any program, as a service pushes the developers of distributed applications to use native SaaS solutions for typical tasks: ElastiCache for caching, RDS for relational databases, DynamoDB for big document-based databases, EFS for the local file system. Taken together, the Amazon services resemble a distributed computer. And similar to the Docker approach, this needs a declarative infrastructure description, so Amazon came up with CloudFormation, a vast cloud configuration engine based on so-called templates. Templates can be written in JSON or YAML. Here is an example of a DynamoDB database defined in a CloudFormation template.

CloudFormation JSON template for DynamoDB
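
As a rough illustration, a minimal template of that kind might look like the sketch below. It is built as a Python dict and printed as JSON; the logical ID UsersTable, the table name and the key attribute are hypothetical and only serve the example.

```python
import json

# Hypothetical minimal CloudFormation template with a single DynamoDB table.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "UsersTable": {
            "Type": "AWS::DynamoDB::Table",
            "Properties": {
                "TableName": "users",
                # On-demand billing, so no provisioned throughput is declared.
                "BillingMode": "PAY_PER_REQUEST",
                "AttributeDefinitions": [
                    {"AttributeName": "user_id", "AttributeType": "S"}
                ],
                "KeySchema": [
                    {"AttributeName": "user_id", "KeyType": "HASH"}
                ],
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Even this single table takes some twenty lines, which hints at how quickly such templates grow.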

First released in 2011, CloudFormation grew together with the portfolio of AWS Services, and now it supports almost the entire spectrum of technologies.

A full list of CloudFormation supported technologies can be found here.

Like many pioneers, CloudFormation had to make all the mistakes and has never become a widely accepted technology, as the mere existence of Terraform/Terragrunt and CDK shows. Bulky configs and the crippled template languages it was based on made the wordy configs for any reasonably serious infrastructure practically unreadable. In 2019, AWS came up with CDK.

CDK: stepping back to the “programming”

Cloud Development Kit (v2), or CDK, takes the infrastructure-as-code approach to a whole new level. Instead of drifting away from actual programming with various TOML, YAML, JSON and other *ML files, CDK treats infrastructure as real code, sorry for the tautology.

Markup languages, no matter how well designed, will not let the developer create the complex structures typical of real programming languages: conditions, loops, functions and whatnot. And this is the actual purpose of their existence: to separate logic from configuration, for example, allowing operations engineers who are less proficient in hardcore development, as well as developers themselves, to modify software properties and behavior without having to reprogram the actual software.

Using markup languages for infrastructure seemed only logical: why would we need a loop in a Docker Compose config or a CloudFormation template? That holds only for comparatively simple topologies. As soon as you imagine any vast corporate infrastructure, you immediately need something like functions and modules to keep your code DRY, to loop over objects of a similar nature, like lambdas, or to set conditional statements for the different environments of your distributed application. Many markup languages actually provide rudimentary analogs of those features, but none of them can compare with the expressive power of a real programming language.
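
To make the point concrete, here is a small sketch in plain Python, without CDK, that generates a CloudFormation-style dict using a function, a loop and a condition. The function name build_template, the worker names and the retention values are hypothetical choices for the example.

```python
import json


def build_template(environment: str, workers: list[str]) -> dict:
    """Return a CloudFormation-style dict with one SQS queue per worker."""
    # Condition: keep messages for 14 days in production, one hour elsewhere.
    retention_seconds = 1209600 if environment == "prod" else 3600

    resources = {}
    for worker in workers:  # loop over resources of a similar nature
        resources[f"{worker.capitalize()}Queue"] = {
            "Type": "AWS::SQS::Queue",
            "Properties": {
                "QueueName": f"{environment}-{worker}",
                "MessageRetentionPeriod": retention_seconds,
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


dev_template = build_template("dev", ["emails", "reports", "thumbnails"])
prod_template = build_template("prod", ["emails", "reports", "thumbnails"])
print(json.dumps(dev_template, indent=2))
```

Adding a fourth queue or a third environment is a one-word change here, while in a static template it means another copied block.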

So CDK is the embodiment of that principle. With CloudFormation templates under the hood, CDK provides language-native SDKs for various platforms: JavaScript/TypeScript, Java, Python, .NET and Go. The central concept, as in CloudFormation, is the Stack: a collection of objects and resources, associated with each other by purpose and usage patterns. Stacks inherit from the official library's Stack class. Here is a real-world example: creating a lambda with an SQS trigger + Sentry.

The Stack class in the Python library doesn't have an implementation of its own: the class declarations are only a facade over the real implementation in JavaScript, bridged by the jsii library. Here is what the Stack class header looks like:

@jsii.implements(ITaggable)
class Stack(
    constructs.Construct,
    metaclass=jsii.JSIIMeta,
    jsii_type="aws-cdk-lib.Stack",
):
    ...

Using a real programming language comes with all its benefits: repetitive instructions can be wrapped in a function. And unit testing comes as a bonus: CDK renders the Stack to a CloudFormation template, against which you can run your assertions without deploying the stack to the real cloud. Quite neat, huh? The assertions that come out of the box are quite basic, so here are a couple of examples and little util functions:
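
A minimal sketch of what such helpers could look like, written against a plain dict. In a real test the dict would presumably come from aws_cdk.assertions.Template.from_stack(stack).to_json(); the helper names and the hand-written sample template below are hypothetical stand-ins, not the original utilities.

```python
def resources_of_type(template: dict, resource_type: str) -> dict:
    """Return {logical_id: resource} for all resources of the given type."""
    return {
        logical_id: resource
        for logical_id, resource in template.get("Resources", {}).items()
        if resource.get("Type") == resource_type
    }


def is_subset(expected, actual) -> bool:
    """True if every key/value in expected appears, recursively, in actual."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and is_subset(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual


def assert_has_resource(template: dict, resource_type: str, properties: dict) -> None:
    """Fail unless at least one resource of the type matches the properties."""
    candidates = resources_of_type(template, resource_type)
    assert any(
        is_subset(properties, resource.get("Properties", {}))
        for resource in candidates.values()
    ), f"no {resource_type} with properties {properties}"


# Hand-written stand-in for a synthesized template (assumption for the example).
sample_template = {
    "Resources": {
        "JobsQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {"VisibilityTimeout": 60},
        },
        "WorkerMapping": {
            "Type": "AWS::Lambda::EventSourceMapping",
            "Properties": {"BatchSize": 10, "Enabled": True},
        },
    }
}

assert_has_resource(sample_template, "AWS::Lambda::EventSourceMapping", {"BatchSize": 10})
assert len(resources_of_type(sample_template, "AWS::SQS::Queue")) == 1
```

Matching on a subset of properties keeps the tests robust against the many generated fields a synthesized template usually carries.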

And here is the price: you can only run your application in the cloud. What to do?

The main SaaS deal

While using SaaS and deploying with CDK and similar services is a low-boilerplate solution compared to running your own cloud on Kubernetes, it comes with a big price (tag). Corporations like Amazon will do everything to lock your solution into the Amazon universe, making a migration to another platform as hard as possible.

With a target platform like AWS, at a certain point you have to decide to give up local development: few AWS services can be run and tested locally, so local development, so natural for a Python + Postgres + Redis stack, becomes a real luxury. You would either need to maintain both Docker Compose and AWS setups, which is quite irrational, or settle on one platform and test only there.

A remarkable solution to this duality is moto: a tremendous library that mocks many of the AWS services in your Python code, making unit tests for Python code with vast AWS infrastructure a piece of cake. Moto creates an entire AWS cloud in memory, where you can run simple tests against your queues, databases and notifications without having to use the real infrastructure.

Cloud-based testing has the following significant problems:

  • slow: a test suite of 200–300 tests can run for up to 30 minutes in the cloud, against 1–2 minutes with moto
  • not isolated: if another developer starts the tests at the same time, both runs will most probably fail
  • not repeatable: tests on real infrastructure will inevitably leave artifacts behind if execution was incomplete and the cleanup scripts didn't kick in

As a consequence, developers won't run the tests often enough or write new ones, because the tests fail randomly and take too long.

The main price of moto is the mocking itself: execution paths can sometimes be tricky, so finding out why your beautiful mocked S3 is not working can take some time.

Mocking the entire cluster with moto

Among other services, moto supports CloudFormation, although not completely. Here is the list of supported features.

Usually with moto you would mock a service with a pytest generator fixture like the following:

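import pytest
from moto import mock_s3  # moto < 5; moto 5 replaced the per-service mocks with mock_aws


@pytest.fixture(scope="session")
def s3_mock():
    with mock_s3():
        import boto3

        # yield keeps the with block, and thus the mock, open during the tests
        yield boto3.client("s3", region_name="eu-central-1")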

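Mind the yield part: the mock is active only while the with block stays open, so the fixture has to be a generator rather than return the client. The same holds for every other fixture up the chain to the test that opens a mock context of its own.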
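
Why the yield matters can be shown without moto at all. The sketch below uses a hypothetical stand-in context manager, fake_mock, that only flips a flag; it is not moto's implementation, but a with block behaves the same way for any context manager. pytest drives generator fixtures in a similar manner: it advances them to the yield before the test and resumes them afterwards for teardown.

```python
from contextlib import contextmanager

# Flag standing in for "the mock is currently patching the AWS calls".
ACTIVE = {"mock": False}


@contextmanager
def fake_mock():
    """Hypothetical stand-in for a moto mock context manager."""
    ACTIVE["mock"] = True
    try:
        yield
    finally:
        ACTIVE["mock"] = False


def client_returned():
    with fake_mock():
        return "client"  # leaving the function closes the with block


def client_yielded():
    with fake_mock():
        yield "client"  # the with block stays open while suspended here


returned = client_returned()
state_after_return = ACTIVE["mock"]  # mock already switched off

generator = client_yielded()
yielded = next(generator)            # what pytest does before the test
state_after_yield = ACTIVE["mock"]   # mock still active for the test body
generator.close()                    # roughly what happens at teardown
state_after_close = ACTIVE["mock"]

print(state_after_return, state_after_yield, state_after_close)
```

The returned client outlives its mock, which is exactly the situation where calls start hitting the real endpoints or failing in confusing ways.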

First, you need to create a CloudFormation template from your CDK app:

cdk synth > template.yaml

This template contains all your cloud infrastructure, including the lambdas and their code.

The cdk bootstrap command creates a CloudFormation stack that describes all the infrastructure CDK needs to work, like the S3 buckets that store the lambdas' code. It's a bootloader of sorts. I failed to make this template work, so here is the pytest fixture that creates the necessary infrastructure manually and also fixes various errors I had to face while loading my current project's CloudFormation template (lambdas, SQS triggers):

Various hacks were needed, like creating imaginary zips with code and so on. It seems moto had not been tried in this scenario before. Besides, processing a medium-sized config took up to 20–30 seconds, which makes it impractical for unit tests. Intermediate integration testing is conceivable, though.
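
One of those hacks, the imaginary zip, can be sketched with the standard library alone. The helper name make_fake_lambda_zip, the file name index.py and the stub handler are hypothetical; the resulting bytes could then presumably be uploaded to the mocked bucket with the S3 client's put_object, under whatever bucket and key the synthesized template refers to.

```python
import io
import zipfile

# Stub handler source; the real lambda code is irrelevant for the mock.
STUB_HANDLER = "def handler(event, context):\n    return {}\n"


def make_fake_lambda_zip(file_name: str = "index.py", source: str = STUB_HANDLER) -> bytes:
    """Build an in-memory zip archive containing a single stub handler file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(file_name, source)
    return buffer.getvalue()


fake_zip = make_fake_lambda_zip()

# Read the archive back to confirm it is a valid zip with the expected file.
with zipfile.ZipFile(io.BytesIO(fake_zip)) as archive:
    names = archive.namelist()
    content = archive.read("index.py").decode()

print(names, len(fake_zip))
```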

Some links to start with CDK and Python

Getting started: https://docs.aws.amazon.com/cdk/v2/guide/work-with-cdk-python.html

Tutorial: https://hands-on.cloud/how-to-use-aws-cdk-to-deploy-python-lambda-function/#source-code
