Serverless + Evolutionary Architectures + Safe Deployments = Speed in the Right Direction

Danilo Poccia
11 min read · Sep 3, 2018


Photo by TETrebbien on Unsplash

With Serverless Computing, you can build and run applications without thinking about servers. To me, the greatest advantage is the ability to focus on what you want to build, instead of on the nuts and bolts required to implement the solution.

“IT is no longer holding back the business. In fact, it’s helping us grow faster” as described in the PhotoVogue case study.

As you accelerate your pace of development, you need to be more careful to hold the right direction and to define clearly where you want to “go”. In fact, a best practice is to think of your development not as a project (with a clear start and end date), but as a product, which customers will keep using as long as they find it useful.

“Lots of teams today struggle with scheduling something usually called technical work: system improvements that the team finds important, but nobody actually asked for them. Do too little technical work, and you’re actively damaging the product by slowly turning it into an unmaintainable mess. Do too much of it, and you’re actively damaging the product by delaying important business features.”

Gojko Adzic, Sprints, marathons and root canals

There are different approaches that can help you keep your product on the right course. Here, I’d like to focus on Evolutionary Architectures. For this article, I was inspired by Michael Wittig’s talk at JeffConf Hamburg.

What are Evolutionary Architectures?

Photo by Ricardo Gomez Angel on Unsplash

To me, the term “architecture” applied to software development brings to mind static solutions that are difficult to change. Unfortunately, that is often the result of a software implementation.

An evolutionary architecture designs for incremental change as a first principle.

The idea is to look at the lifecycle of a software product as an optimization task. When you add a new feature, refactor some code, or improve the security or the scalability of your solution, you are actually optimizing it towards some specific goals set for the product. But how can you be sure you are doing the right thing? And how can you define the right priorities between the different functional and non-functional requirements that need to be implemented? Should you focus on scalability or availability first?

If you look at this problem as an optimization task, you can think of the search for a solution as the exploration of the space of all possible solutions. How can you find the best one? What can you measure? When the space to explore is too vast, you can use the same approach as evolutionary algorithms, which take inspiration from how natural evolution optimizes biological populations.

The core concept of evolutionary algorithms is to create a “population” of possible solutions and evaluate them using a fitness function that estimates how well each solution solves a specific problem.

Using these estimations, you can keep a few of the best “candidate solutions” from the population, and generate new ones from them, in an attempt to efficiently explore the vast space of all possible solutions. This process of generation and selection can be repeated until you find a “good” solution for your problem.
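As a minimal sketch of this generate-evaluate-select loop (the fitness function and the mutation here are toy placeholders, not anything you would use for a real problem):

```javascript
// A toy evolutionary loop: evaluate a population, keep the best candidates,
// and mutate them to generate new ones. Fitness and mutation are placeholders.
const fitness = (candidate) => -Math.abs(candidate - 42); // toy goal: get close to 42
const mutate = (candidate) => candidate + (Math.random() - 0.5) * 10;

let population = Array.from({ length: 20 }, () => Math.random() * 100);

for (let generation = 0; generation < 100; generation++) {
  // Selection: keep a few of the best "candidate solutions"...
  const survivors = [...population]
    .sort((a, b) => fitness(b) - fitness(a))
    .slice(0, 5);
  // ...and generation: create new candidates starting from the survivors.
  const offspring = Array.from({ length: 15 }, () =>
    mutate(survivors[Math.floor(Math.random() * survivors.length)]));
  population = survivors.concat(offspring);
}

const best = [...population].sort((a, b) => fitness(b) - fitness(a))[0];
console.log('Best candidate found:', best); // should be close to 42
```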

In the case of software development, the space of all possible solutions is actually all possible computer programs, software architectures and tools you can use, and it is quite difficult to generate new solutions “automatically” in an efficient way. But you can still keep the idea of a fitness function to evaluate:

  1. how good the current implementation is
  2. how much it would improve if you implement a specific functional or non-functional requirement

The actual absolute value of a fitness function has no meaning. It is the relative difference, comparing two evaluations, that tells you which one has to be preferred.

The fitness of an application should grow over time, with the release of new versions, by adding specific support for non-functional requirements such as scalability, availability, and security.

Using evolutionary architectures, you define a fitness function that estimates how “good” an implementation is now, or how much it can improve if you implement specific changes. This fitness function can be used to drive and prioritize future developments.

A fitness function can monitor:

  • source code metrics (such as measuring cyclomatic complexity)
  • unit tests (% of coverage and % of success)
  • integration tests (% of success over a set of synthetic transactions)
  • performance metrics (such as API latency or throughput)
  • security (encryption at rest, for example checking that all S3 buckets and DynamoDB tables have encryption enabled, or automatic key rotation for all external APIs, using tools such as the AWS Secrets Manager)

If you are in doubt about where to focus, a well-defined fitness function can tell you which change would give the biggest increase in fitness.

There is a similar concept in the field of machine learning: the objective function, used during the training of a model. In that case, there are two main components:

  • a loss function, which measures how predictive the model is on your data
  • a regularization term, which measures the complexity of the model

In machine learning, to have a successful model, you try to minimize their sum. In the case of the fitness function, you want to maximize its value, but you can switch from minimizing to maximizing by changing the sign of the function.
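In the usual notation (θ are the model parameters, λ weighs the regularization term), training minimizes the sum of the two terms, and flipping the sign turns that minimization into the maximization you want for a fitness function:

```latex
\min_{\theta}\; L(\theta) + \lambda\,\Omega(\theta)
\quad\Longleftrightarrow\quad
\max_{\theta}\; -\bigl(L(\theta) + \lambda\,\Omega(\theta)\bigr)
```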

The idea of having a “regularization” term for software architectures is a good practice, to avoid adding complexity to your solution unless there is a measurable return.

For example, a regularization term for a fitness function can reduce the score for every new “component” you have to manage (install, patch, scale, …), because every component is introducing its own non-functional requirements.
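As a minimal sketch of such a fitness function (the metrics, weights, and per-component penalty are made-up values for illustration):

```javascript
// Toy fitness: a weighted sum of normalized metrics (all in [0, 1]),
// minus a penalty for every component you have to manage yourself.
// Metric names, weights, and the penalty are illustrative assumptions.
function fitness({ testCoverage, testSuccessRate, encryptedStores,
                   totalStores, managedComponents }) {
  const quality = 0.4 * testCoverage + 0.4 * testSuccessRate;
  const security = 0.2 * (encryptedStores / totalStores);
  const regularization = 0.05 * managedComponents; // cost of each component
  return quality + security - regularization;
}

console.log(fitness({
  testCoverage: 0.85, testSuccessRate: 1.0,
  encryptedStores: 3, totalStores: 4, managedComponents: 2,
})); // 0.79 — meaningful only relative to another evaluation
```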

Why Safe Deployments?

Photo by rawpixel on Unsplash

On the other hand, it is important not just to evolve a software product in the right direction, but also to get the latest version into production quickly and safely. In this way, you can get feedback on how the new version is working.

Some feedback will be technical, for example:

  • Is the latency ok?
  • Is the user interface working on all devices?

Other feedback will be more practical and business oriented:

  • Are the new features you just implemented used by customers?
  • Can you find a way to measure the improvement for your users?

The overall duration of this feedback loop is probably the best indicator of the speed of development.

Being able to deploy quickly and safely to production, and to roll back to the previous version automatically when you need to, is what gives you the confidence to deploy often.

To make deployments easier, last year at re:Invent AWS introduced a few features, in AWS SAM and AWS CodeDeploy, that can help here.

Using those features, you can:

  • Define serverless applications in a template (a text file), using a simple and clean YAML-based syntax
  • Deploy what is described in the template as a stack, and keep multiple stacks for different environments, such as test, user acceptance, and production
  • Choose, globally for all functions or at the function level, a deployment strategy, such as canary or linear deployments (see the template sketch below)
  • Monitor deployments using alarms and hooks that can automatically roll back when needed
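In the AWS SAM template, those choices live under each function’s DeploymentPreference section. A minimal sketch, assuming an alarm and two hook functions defined elsewhere in the same template (myErrorsAlarm is a made-up name):

```yaml
Resources:
  myFirstFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs8.10
      CodeUri: src/
      AutoPublishAlias: live        # required to enable gradual deployments
      DeploymentPreference:
        Type: Linear10PercentEvery1Minute   # or Canary10Percent5Minutes, ...
        Alarms:
          - !Ref myErrorsAlarm              # CloudWatch alarms that trigger a rollback
        Hooks:
          PreTraffic: !Ref preTrafficHook   # validation before traffic shifting
          PostTraffic: !Ref postTrafficHook # validation after traffic shifting
```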

In this context, alarms are actually a list of CloudWatch alarms, triggered by standard (provided by AWS) or custom (published by you) CloudWatch metrics, that are monitored during the deployment.

By hooks here I mean two validating Lambda functions that are run respectively before and after traffic shifting:

  • For example, with a PreTraffic hook, you can run integration tests against the newly created Lambda version (not serving traffic yet)
  • Using a PostTraffic hook, you can run end-to-end validation checks, including results from monitoring during deployment, such as the latency of your function as your workload is moved to the new version
Safe Deployment flow, each step can roll back to the previous state

AWS CodeDeploy invokes hook functions asynchronously. Each function receives in its input the information it needs to report back success, to move the deployment forward, or failure, to roll it back.

In this way, you are not limited by the maximum execution duration of the hook function. You can implement checks or activities that can run for several minutes or hours, for example using AWS Step Functions. You can then complete the hook by calling the CodeDeploy putLifecycleEventHookExecutionStatus API.

Example of a CodeDeploy Serverless Deployment with a PreTraffic Lambda function.
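A minimal sketch of such a hook on the Node.js 8.10 runtime (the validation logic is a placeholder): CodeDeploy passes the deployment ID and the hook execution ID in the event, and the function reports the outcome back through the putLifecycleEventHookExecutionStatus API.

```javascript
const AWS = require('aws-sdk');
const codedeploy = new AWS.CodeDeploy();

// Placeholder for your own validation logic; throw to fail the deployment
const runMyValidationTests = async () => { /* ... */ };

exports.handler = async (event) => {
  let status = 'Succeeded';
  try {
    await runMyValidationTests();
  } catch (err) {
    status = 'Failed'; // reporting a failure triggers the rollback
  }
  // Report the result back to CodeDeploy, using the IDs passed in the event
  await codedeploy.putLifecycleEventHookExecutionStatus({
    deploymentId: event.DeploymentId,
    lifecycleEventHookExecutionId: event.LifecycleEventHookExecutionId,
    status,
  }).promise();
};
```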

I suggest you add a custom CloudWatch metric to monitor your key business metric and see how it is affected by a deployment. If there is an unexpected change, you can roll back and understand what happened by looking at logs and tracing information.

For example, Netflix looks at the number of “plays” per second to understand if something important has been broken by a deployment.

Putting It All Together

Photo by Caleb Woods on Unsplash

Fitness functions can be fully, or partially, automated. So how can you implement a fitness function monitoring your serverless application?

The hook functions are very good candidates. They can act as two fitness functions triggered at deployment time:

  • the PreTraffic function can check what is measurable before the deployment (such as functional requirements, or security configurations)
  • the PostTraffic function can estimate the impact of the deployment (performance, user satisfaction, or other business metrics)
The CloudFormation stack updating, with links to the CodeDeploy deployments of the updated Lambda functions

Those two functions can post their own fitness estimates as CloudWatch custom metrics. The application itself can post other estimations as custom metrics (for example, how much recently introduced features are being used).
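For example, a hook could publish its estimate like this (the namespace, metric name, and dimension are made-up names):

```javascript
const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

// Publish a fitness estimate as a CloudWatch custom metric.
// Namespace, metric name, and dimension are illustrative assumptions.
async function publishFitness(value, stackName) {
  await cloudwatch.putMetricData({
    Namespace: 'Fitness',
    MetricData: [{
      MetricName: 'PreTrafficFitness',
      Dimensions: [{ Name: 'Stack', Value: stackName }],
      Value: value,
      Unit: 'None',
    }],
  }).promise();
}
```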

CloudWatch metric math can then be used to build an overall “fitness metric”, for example as a weighted sum of the fitness estimates posted by the deployment hooks or by the application.

The overall “fitness metric” can be monitored, deployment after deployment, and visualized using, for example, CloudWatch Dashboards.

Sample dashboard showing how fitness is changing over time, after multiple deployments. In this test case, fitness is going up and down, something you may want to avoid in production.

A Sample Implementation

Photo by Markus Spiske on Unsplash

To make it easier to try this approach, I created a sample implementation that you can use as a starting point for your development.

To build it, I started with sample code in the AWS SAM repository.

I updated the Node.js runtime to version 8.10 to make use of the new async/await syntax.

The AWS SAM template.yaml creates all the resources used in this example, including the two Lambda functions and the preTrafficHook function described below.

To test the deployment, you can run the AWS CLI CloudFormation package/deploy commands two times (see the sketch after this list):

  • the first time to create the CloudFormation stack for the application, as described above
  • the second time to update the stack, see how safe deployments work, and how the fitness of the architecture is measured by the PreTraffic function
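A minimal sketch of those two commands (the artifacts bucket and stack name are placeholders):

```bash
# First step: upload the local code to S3 and generate a deployable template
aws cloudformation package \
    --template-file template.yaml \
    --s3-bucket <your-artifacts-bucket> \
    --output-template-file packaged.yaml

# Second step: create the stack the first time, update it the following times
aws cloudformation deploy \
    --template-file packaged.yaml \
    --stack-name safe-deployments-sample \
    --capabilities CAPABILITY_IAM
```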

You can follow the first creation of the stack, and the subsequent updates, from the CloudFormation console. The previous commands use the default region set for the AWS CLI.

Optionally, you can use the new AWS SAM CLI to validate the template and package/deploy the application. The SAM CLI adds lots of features that I am not using in this article, such as starter code for multiple programming languages, quick access to logs, local emulation, and generation of sample events.

For the two Lambda functions, I used different deployment strategies:

  • myFirstFunction is using a Linear deployment, adding 10% of the invocations to the new version every minute, completing the deployment in 10 minutes (Linear10PercentEvery1Minute)
  • mySecondFunction is using a Canary deployment, with 10% of the invocations to the new version for 5 minutes, and then a rollout to 100%, completing the deployment in 5 minutes (Canary10Percent5Minutes)

In order to have a new deployment of these two Lambda functions, you need to change something in the code in the src/ folder (or at least save one of the source files again, so that there is a different timestamp).

The preTrafficHook function runs some tests to check whether the deployment must Succeed or Fail, and at the same time computes the value of the fitness function for this deployment:

  • some of the tests are actually atomic fitness functions themselves, testing a single resource in the CloudFormation stack
  • other tests can act as holistic functions, testing that multiple resources (such as functions and databases) are working together in an expected way

To simplify and reuse atomic tests on single resources, the SAM template passes the CloudFormationStackId to the preTrafficHook function as an environment variable.

Return the list of resources in a CloudFormation Stack
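A minimal sketch of that lookup, handling the pagination of the results via NextToken:

```javascript
const AWS = require('aws-sdk');
const cloudformation = new AWS.CloudFormation();

// Return all resources in a stack, following pagination via NextToken
async function getStackResources(stackId) {
  let resources = [];
  let nextToken;
  do {
    const page = await cloudformation.listStackResources({
      StackName: stackId, // a full stack ID (ARN) is accepted here
      NextToken: nextToken,
    }).promise();
    resources = resources.concat(page.StackResourceSummaries);
    nextToken = page.NextToken;
  } while (nextToken);
  return resources;
}
```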

Using the StackId, the function gets the list of the resources in the stack, over which it can iterate and apply specific tests depending on the resource type. For example, you can have a checklist for S3 buckets, and another checklist for DynamoDB tables. In this way, there is no need to customize the preTraffic function when a new resource is added.

Run tests on all CloudFormation resources in the stack

Most of the tests involve invocations to AWS services. To reduce the overall duration of this function, those invocations are executed concurrently:

  • all tests are implemented as async functions (so that they are automatically wrapped as promises)
  • all tests are added to a list (array) that is then executed using Promise.all() (see the sketch below)
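A minimal sketch of that dispatch, assuming a few placeholder test functions (one of them is sketched further below):

```javascript
// Map resource types to the atomic tests to run on each resource.
// The test implementations here are placeholders for your own checks.
const checkBucketEncryption = async (bucketName) => { /* sketched below */ };
const checkBucketVersioning = async (bucketName) => { /* ... */ };
const checkTableAutoScaling = async (tableName) => { /* ... */ };

const testsByResourceType = {
  'AWS::S3::Bucket': [checkBucketEncryption, checkBucketVersioning],
  'AWS::DynamoDB::Table': [checkTableAutoScaling],
};

async function runAllTests(resources) {
  const pending = []; // one promise per (resource, test) pair
  for (const resource of resources) {
    const tests = testsByResourceType[resource.ResourceType] || [];
    for (const test of tests) {
      pending.push(test(resource.PhysicalResourceId)); // async call, not awaited yet
    }
  }
  return Promise.all(pending); // rejects, failing the hook, if any test throws
}
```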

Some of the tests can check non-functional requirements, such as security and scalability, for example:

  • check that encryption at rest is enabled on all S3 buckets
  • check that versioning is enabled on all S3 buckets
  • check that encryption at rest is enabled on all DynamoDB tables
  • check that public write and/or read is prohibited for all S3 buckets
  • check that S3 buckets accept HTTPS requests only
  • check that auto scaling is enabled for all DynamoDB tables

Those checks contribute to the measurement of the fitness function. If you change your architecture (and your application) to be more secure or scalable, you automatically increase the resulting fitness.
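For example, the encryption-at-rest check on a single bucket could be sketched like this (getBucketEncryption throws a specific error when no server-side encryption configuration exists):

```javascript
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Atomic test: pass if the bucket has default encryption at rest enabled
async function checkBucketEncryption(bucketName) {
  try {
    await s3.getBucketEncryption({ Bucket: bucketName }).promise();
    return true; // an encryption configuration is present
  } catch (err) {
    if (err.code === 'ServerSideEncryptionConfigurationNotFoundError') {
      throw new Error(`Bucket ${bucketName} has no encryption at rest`);
    }
    throw err; // unexpected error: let the hook fail and roll back
  }
}
```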

Instead of implementing all tests, you can use AWS Config and leverage the existing AWS Config managed rules, such as:

  • s3-bucket-logging-enabled
  • s3-bucket-replication-enabled
  • s3-bucket-versioning-enabled
  • s3-bucket-public-write-prohibited
  • s3-bucket-public-read-prohibited
  • s3-bucket-ssl-requests-only
  • s3-bucket-server-side-encryption-enabled
  • dynamodb-autoscaling-enabled
  • dynamodb-throughput-limit-check
  • lambda-function-public-access-prohibited
  • lambda-function-settings-check

A full list of AWS Config managed rules is available in the AWS Config documentation. To check compliance with one or more of those rules, I am using the AWS Config getComplianceDetailsByResource API.
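A minimal sketch of that check for a single resource:

```javascript
const AWS = require('aws-sdk');
const configservice = new AWS.ConfigService();

// Check the recorded compliance of one resource against AWS Config rules
async function isResourceCompliant(resourceType, resourceId) {
  const result = await configservice.getComplianceDetailsByResource({
    ResourceType: resourceType, // e.g. 'AWS::S3::Bucket'
    ResourceId: resourceId,
    ComplianceTypes: ['NON_COMPLIANT'],
  }).promise();
  // No NON_COMPLIANT evaluations recorded means the resource is compliant
  return result.EvaluationResults.length === 0;
}
```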

Conclusion

Photo by Ethan Weil on Unsplash

Software architectures are dynamic environments, and fitness functions can be used to monitor new releases, compare the results of different changes, and plan future developments.

With serverless architectures, safe deployments can be used to automate fitness functions and further reduce the duration of the feedback loop, improving the speed and agility of a development team.


Danilo Poccia

Passionate about IT, IoT, AI, ML, and other acronyms. Writing the Chronicles of Computation book. Chief Evangelist (EMEA) @ AWS. Opinions are my own.