Protecting Your Serverless Solution

Sat G
Published in Version 1
Jul 20, 2020

In recent years, the growth of Serverless has been monumental. More and more organisations are realising the benefits of the technology: not having to manage the underlying infrastructure and being able to scale on demand. Much of the growth, however, has been haphazard and without a strategy, which has led to security becoming an afterthought.

This article will describe how an organisation can protect its Serverless solution. It will focus on solutions deployed in AWS with its Serverless offering, Lambda. Nevertheless, many of the principles can be applied to all cloud platforms.

Why are Serverless Solutions Insecure?

Serverless technology enforces the use of microservice design patterns, resulting in numerous small services, each performing a unique piece of functionality. With so many services, an organisation can develop Serverless Sprawl, which can lead to unknown, unmanaged and insecure Serverless solutions.

The Components of Serverless

At the core of all Serverless solutions will be a form of programming language or code. Any solution will require various touchpoints and integrations such as data stores and orchestration technologies. These components will form part of the entire solution and therefore the security of each of them is just as important as the code itself.

Various components of Serverless Solutions.

Cost-Benefit Analysis

The first step in securing any solution is to perform a Cost-Benefit Analysis. This involves identifying the components of the Serverless solution that is being considered. A risk assessment is then performed. The impact and the likelihood of the loss of part or the entire solution must be determined using both a quantitative and qualitative approach. The resulting risk assessment will determine how important the solution is to the organisation. It will assist in the understanding of the approach to risk that will be taken; whether the risk should be mitigated, accepted, avoided or transferred. The total cost of security controls implemented should always be less than the cost of the loss of the system.

Serverless Pricing


To secure the solution, the pricing of Serverless needs to be considered. There are two types of pricing when working with Serverless: direct and indirect.

Direct Pricing

This is pricing that applies to the Serverless function; the code.

There is a generous free tier that some organisations take advantage of and never exceed. After the free tier is exhausted the following pricing elements will affect how much an organisation has to pay:

  • Requests — the number of times the Serverless function is invoked
  • Memory — the memory consumed by the Serverless function
  • Duration — the amount of time taken by the execution of the Serverless function
  • Provisioned Concurrency — the expected use of the function. This helps to reduce Serverless function invocation latency
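As a back-of-the-envelope illustration, the direct cost can be estimated from these elements. The default prices below are assumptions based on published us-east-1 rates at the time of writing; they ignore the free tier, provisioned concurrency and indirect costs, so always check the current AWS price list.

```python
def estimate_lambda_cost(invocations, avg_duration_ms, memory_mb,
                         price_per_million_requests=0.20,
                         price_per_gb_second=0.0000166667):
    """Rough monthly cost estimate for a Lambda function.

    The default prices are illustrative assumptions (us-east-1 at the
    time of writing) and exclude the free tier, provisioned
    concurrency and indirect costs.
    """
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Compute is billed in GB-seconds: duration times allocated memory
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    return round(request_cost + compute_cost, 2)

# 10 million invocations, 120 ms average duration, 256 MB memory
print(estimate_lambda_cost(10_000_000, 120, 256))
```

Doubling the memory or the duration doubles the compute portion of the bill, which is why tuning these values matters as much for cost as for performance.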

Indirect Pricing

As mentioned earlier, the Serverless function itself will interact with a number of other components to form part of the overall solution. There will be a cost associated with the various components that are used by the solution. For example, if S3 is used, then the cost of the amount of storage and the cost of data requests will have to be accounted for.

One of the often-forgotten aspects of any solution is data transfer costs. Any data that is transferred out of the AWS region will have to be included. This is often the source of surprise if it is not initially factored into the pricing of the solution.
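Data transfer can be sketched the same way. The rate below is an assumption for illustration only; actual transfer-out pricing is tiered and varies by region and destination.

```python
def estimate_egress_cost(gb_out, price_per_gb=0.09):
    """Illustrative cost of data transferred out of an AWS region.

    The default rate is an assumption; real transfer pricing is
    tiered and varies by region and destination.
    """
    return round(gb_out * price_per_gb, 2)

# 500 GB leaving the region in a month
print(estimate_egress_cost(500))
```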

The Shared Responsibility Model

The benefit of using the cloud is that the underlying platform that services run on is managed by the cloud provider. Part of a cloud solution will be managed by the cloud provider and part of it will be managed by the cloud customer. The level of responsibility will depend on the service that is being used.

With Serverless, the cloud provider manages the platform that the cloud customer’s code will run on. Feature and security updates to the platform are carried out by the cloud provider. The configuration and code of the solution will lie with the cloud customer.

Organisations adopting Serverless solutions need to understand where their responsibility starts and where it stops. This will provide detailed knowledge as to where security controls need to be deployed.

The AWS Shared Responsibility Model.

Know Normal

Several questions should be asked regarding the solution when securing it:

  • What is it trying to achieve for the business?
  • How is the solution technically implemented?
  • How is it orchestrated?
  • What are the touch and integration points?
  • How is it operationally maintained?
  • Which users and services have access?
  • What are the recurring issues?
  • How are errors handled?
  • How is it monitored and what alerts are in place?

Before implementing any security controls, it is vital to have a grasp of the important metrics and key performance indicators (KPIs). It is fundamental to understand what performance is measured on, how often the solution is used, and its concurrency and load at various times of the day. Having these metrics enables the organisation to know the normal behaviour of the solution. Any metrics that fall outside of the expected range could indicate a security incident.
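The "know normal" idea can be sketched as a simple baseline check: summarise historic readings of a metric, then flag any new reading that falls well outside the expected range. The metric and sample values here are illustrative.

```python
import statistics

def build_baseline(samples):
    """Summarise 'normal' behaviour from historic metric samples
    (e.g. invocations per hour) as a mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, sigmas=3):
    """Flag a reading outside mean +/- N standard deviations, a
    simple stand-in for a real anomaly detector."""
    mean, stdev = baseline
    return abs(value - mean) > sigmas * stdev

hourly_invocations = [980, 1010, 995, 1020, 990, 1005, 1000, 1015]
baseline = build_baseline(hourly_invocations)
print(is_anomalous(1008, baseline))   # within the expected range
print(is_anomalous(25000, baseline))  # possible security incident
```

A sudden spike in invocations like the second reading could indicate abuse of an exposed endpoint, and a matching alert gives the Operations team an early warning.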

Protecting the Code

Serverless solutions are primarily constituted of programming code. A Source Code Manager (SCM) such as GitHub or AWS CodeCommit is typically used to store and manage the code. An SCM provides accountability by recording information of any changes that are made to the code. The SCM is typically provided as a Software-as-a-Service (SaaS) service by the SCM provider. An option to host and manage it internally is usually available for those that prefer to retain the code internally and not in the cloud.


A minimum set of privileges should be provided to the users or processes for them to be able to complete their task. This is referred to as the Principle of Least Privilege (POLP). If a user only needs read-only access to the code, then this is the only permission that should be provided.

Multi-factor authentication (MFA) should be enabled to provide the additional layer of security when authenticating to the SCM.

When working with the code, SCMs will offer the use of SSH or HTTPS to clone the repository. Both of these are secure transport methods used to transfer the code.

The key branches, such as the master branch or some feature branches, need to be protected. Engineers should not be able to push directly to these branches. Instead, they should create development branches, which are reviewed via a pull request before being merged into one of the key branches.

A consistent naming convention should be applied to function names and commit messages. It is good practice to include a task reference number. This can then tie back to the project tracking system to provide the context and a full description of any changes to the code.
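Such a convention can be enforced automatically in a pre-commit or server-side hook. The `PROJ-123: summary` format below is a hypothetical example; the actual task reference format depends on the project tracking system in use.

```python
import re

# Hypothetical convention: "PROJ-123: short summary", where PROJ-123
# is a task reference in the project tracking system.
COMMIT_PATTERN = re.compile(r"^[A-Z]+-\d+: .+")

def valid_commit_message(message):
    """Check a commit message against the naming convention, so a
    hook can reject non-conforming commits before they land."""
    return bool(COMMIT_PATTERN.match(message))

print(valid_commit_message("SEC-42: restrict S3 bucket policy"))  # True
print(valid_commit_message("fixed stuff"))                        # False
```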

A virtual desktop environment, such as AWS Workspaces, in the cloud, can be provisioned for each engineer. These can be either a Windows or Linux operating system. All development can be carried out in the cloud. The code therefore never resides on the engineer’s laptop or device. Should the device get stolen or misplaced, the code will not be lost.

If mobile devices are used to locally work with code, then a baseline set of requirements should be verified on the device such as antivirus or encryption on drives. Mobile data management (MDM) or mobile application management (MAM) solutions should be considered which can remotely wipe data in the event of a device loss.

Identity Access Management (IAM) and Roles

IAM controls access to resources and the AWS console. Access to the resources should be restricted using POLP. Ideally, the Lambda console should be read only for viewing purposes and all functions should be updated from an automated pipeline.

AWS IAM Access Analyzer is a useful service when working with AWS. It identifies resources that are shared with external entities, providing detailed information on what can be accessed and by whom.

Roles allow AWS services to interact with each other. For example, the role attached to a Lambda function would need to be able to write to CloudWatch to create logs. It is tempting during development to create a single role which is able to interact with any service and attach that role to every function. This, however, causes a security concern and violates the POLP. The recommended practice is to create a role which contains access to only the services a function requires and associate it with that single function. Each function should have its own role.
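A per-function, least-privilege role might carry a policy like the following sketch. The account number, ARNs and actions are placeholders for illustration; a real role also needs a trust policy allowing `lambda.amazonaws.com` to assume it.

```python
import json

def least_privilege_policy(table_arn, log_group_arn):
    """Illustrative IAM policy for one function that only needs to
    read a single DynamoDB table and write its own CloudWatch logs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Action": ["dynamodb:GetItem", "dynamodb:Query"],
             "Resource": table_arn},
            {"Effect": "Allow",
             "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
             "Resource": f"{log_group_arn}:*"},
        ],
    }

# Placeholder account and resource names for illustration
policy = least_privilege_policy(
    "arn:aws:dynamodb:eu-west-1:111122223333:table/orders",
    "arn:aws:logs:eu-west-1:111122223333:log-group:/aws/lambda/get-order")
print(json.dumps(policy, indent=2))
```

Note that the function cannot write to the table, read other tables or touch any other service, which is exactly the point of a per-function role.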

Network

By default, Lambda has access to the internet. It is possible to give Lambda access to resources within a VPC, but the function will then lose its internet access. To restore it, a Network Address Translation (NAT) solution can be implemented, which is the same pattern followed when resources in private subnets require external access.

Secrets


Credentials should never be hardcoded into the code or environment configuration, where they become visible and easily available. Secrets should be stored centrally, where they can be managed and rotated, and a secrets manager can then provide them dynamically at runtime. AWS has two options which can be used:

  • AWS SSM Parameter Store — basic secrets management
  • AWS Secrets Manager — feature-rich secrets management

Access should be restricted to the secret manager console itself using IAM.
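At runtime, a function can fetch secrets on demand and cache them briefly, so that not every invocation calls the secrets manager. The backend here is an injectable callable for illustration; in AWS it would typically wrap SSM Parameter Store (an assumption: boto3's `get_parameter` with `WithDecryption=True`) or Secrets Manager.

```python
import time

class SecretCache:
    """Fetch secrets at runtime and cache them for a short TTL.

    `fetch` is any callable taking a secret name; in AWS it would
    wrap SSM Parameter Store or Secrets Manager. A short TTL keeps
    rotated secrets from going stale for long."""

    def __init__(self, fetch, ttl_seconds=300):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._cache = {}

    def get(self, name):
        value, expires = self._cache.get(name, (None, 0))
        if time.monotonic() >= expires:
            # Cache miss or expired entry: fetch and re-cache
            value = self._fetch(name)
            self._cache[name] = (value, time.monotonic() + self._ttl)
        return value

# Stand-in fetcher for illustration; never hardcode real secrets.
secrets = SecretCache(lambda name: f"value-of-{name}")
print(secrets.get("db-password"))
```

Because Lambda execution environments are reused between invocations, a module-level cache like this can meaningfully cut both latency and secrets-manager request costs.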

Function Configuration

Some of the configuration properties of Lambda are:

  • Concurrency — the number of invocations of the function that can occur simultaneously
  • Throttling — the errors that are generated when concurrency limits are reached
  • Timeout — the duration that the function can run for

These need to be set to appropriate values based on metrics and KPIs. Any issues with the above may be caused by a security issue and therefore should be investigated.
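On the client side, throttling errors are commonly handled with exponential backoff and jitter. The exception class below is a stand-in for illustration; persistent throttling should still raise an alert and be investigated.

```python
import random
import time

class ThrottledError(Exception):
    """Stand-in for the error returned when a function's
    concurrency limit is reached."""

def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.1):
    """Retry a throttled invocation with exponential backoff and
    jitter, a common client-side pattern for concurrency limits."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Double the delay each attempt, with random jitter
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

# Simulated function that is throttled twice, then succeeds
attempts = {"n": 0}
def flaky_invoke():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottledError()
    return "ok"

print(invoke_with_backoff(flaky_invoke))  # succeeds on the third attempt
```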

Data


Lambda can write to any type of data store. Data at rest should be encrypted. Most data stores are encrypted by default, or it is as simple as ticking a flag to enable basic encryption. Access to encryption keys should be restricted and the keys should be rotated.

The residency of the data should be according to the local laws and regulations. There may be a requirement that ensures that data does not leave the region where it is located. This can have an impact on disaster recovery plans which may involve relocation to an alternative region and potentially violate the laws and regulations.

Retention and archiving policies should be set for the data in order to comply with laws and regulations. Organisations should only retain data for as long as the law allows or requires.

API Gateway

Most solutions require interaction with multiple APIs. Use of an API Gateway provides a central management service for internal and external APIs. The AWS API Gateway has many security controls:

  • Gateway Resource Policies and IAM — control access to and invocation of APIs
  • API Keys — ensure that consumers provide expected keys, and support throttling and monitoring of APIs
  • Lambda Authorisers and AWS Cognito — provide mechanisms to control authorisation to the API
  • Throttling and Caching — control the number of invocations of the API; caching is useful when APIs are under heavy load, such as during a DDoS attack
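Throttling in API Gateway follows the token-bucket model: a steady refill rate plus a burst capacity. A minimal sketch of that model:

```python
import time

class TokenBucket:
    """Simple token-bucket rate limiter, the model behind API
    Gateway throttling: a steady refill `rate` (requests per second)
    with a `burst` capacity."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller would receive 429 Too Many Requests

bucket = TokenBucket(rate=10, burst=5)  # 10 req/s steady, burst of 5
results = [bucket.allow() for _ in range(8)]
print(results)
```

Eight back-to-back requests exhaust the burst of five, and the remainder are rejected until the bucket refills, which is how a flood of requests is absorbed without overwhelming the backend.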

AWS Organizations

AWS encourages the creation of multiple accounts to segregate departments and capabilities. With AWS Organizations a hierarchical account structure sitting underneath a master account can be created. Service Control Policies (SCPs) can be used to allow or deny sub-accounts the use of particular services.

An example of AWS Organization use.
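An SCP that denies sub-accounts everything outside an approved set of Serverless services might look like the sketch below. The service list is illustrative, and note that SCPs set a maximum on what IAM can grant; they do not grant access themselves.

```python
import json

# Illustrative Service Control Policy: deny any action outside an
# approved set of Serverless-related services. SCPs cap permissions;
# IAM policies within the account still grant the actual access.
scp = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideApprovedServices",
        "Effect": "Deny",
        "NotAction": [
            "lambda:*", "apigateway:*", "dynamodb:*",
            "s3:*", "logs:*", "cloudwatch:*"
        ],
        "Resource": "*"
    }]
}
print(json.dumps(scp, indent=2))
```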

Consolidated Billing rolls the costs of all sub-accounts up to the master account while still showing costs per individual account. Combining this with enforced tagging of resources gives a granular view of the cost of the resources being used.

Serverless Development

Serverless can be developed using several methods. The traditional model would be for developers to work locally and to push the code up to AWS through the Lambda console.

More modern approaches use containers, which are self-contained and can include all required packages and libraries. This makes it easier to keep all the requirements for the function in a single place.

Cloud Integrated Development Environments (IDEs) such as AWS Cloud9 are rising in popularity. They, like virtual desktop environments, allow for all development to be carried out and remain in the cloud.

Secure development practices should be followed and may be specific to the programming language being used. The Open Web Application Security Project (OWASP) provides a useful set of resources and recommended practices on how to secure web applications, including a list of high-priority security risks that anyone involved in a Serverless solution should be familiar with.

The OWASP Top 10 (2017) of critical risks are:

1. Injection

2. Broken Authentication

3. Sensitive Data Exposure

4. XML External Entities (XXE)

5. Broken Access Control

6. Security Misconfiguration

7. Cross-Site Scripting (XSS)

8. Insecure Deserialization

9. Using Components with Known Vulnerabilities

10. Insufficient Logging & Monitoring

DevOps / DevSecOps

Adopting a Development Operations (DevOps) or Development Security Operations (DevSecOps) culture in the organisation breaks down barriers between capabilities and enables rapid collaboration. It involves the use of automation tools such as AWS CodeCommit, AWS CodeDeploy and AWS CodePipeline, along with third-party tools.

Automation of code build, test and deploy activities provides numerous benefits, particularly the ability to audit changes and to deliver rapid changes and remediation of security issues. The Security capability of an organisation can be integrated into the teams and the automation pipeline. Approval of the various stages should involve the key reviewers along with the Security team.

Testing

Various forms of testing can be integrated into the deployment pipelines. Static Testing is the review of the code in its non-running, uncompiled state. This can be done manually through code reviews and pull requests, or with tools such as AWS CodeGuru and SonarQube for security- and performance-related testing.

Dynamic Testing is the testing of code when the solution is in a running state. There are various solutions on the AWS Marketplace to support dynamic testing and solutions such as Selenium and Apache Bench for security and performance testing.

Penetration Testing involves hiring a third party in an attempt to find vulnerabilities in an organisation’s systems by putting themselves in the mindset of a malicious actor. This can be done as part of the organisation’s wider security program or as a focus on the Serverless solution.

The Well-Architected Framework — Serverless Lens


AWS provides the Well-Architected Framework that contains the pillars of cost optimisation, operational excellence, reliability, performance efficiency and security. They provide best practices in these areas.

There is a Serverless Lens that can be followed to maximise the benefits of Serverless along with securing the solution.

Monitoring, Troubleshooting and Alerting

Many services are available to monitor and investigate security incidents in AWS. A combination of tools should be used and integrated with a Security Information and Event Management (SIEM) service. These should be made available to the Operations and Security teams of the organisation.

  • AWS CloudWatch — used to capture logs for the solution
  • AWS CloudTrail — an audit trail capturing AWS API calls
  • AWS Config — captures changes to resources and, combined with tagging and Lambda, can implement automated remediation
  • AWS X-Ray — allows solution flows to be traced to understand where issues and latency lie
  • AWS SNS — used for alerts based on configured thresholds
  • AWS Security Hub — a SIEM providing dashboards that reflect the security posture of a solution
  • Amazon Detective — provides a visual history of incidents, allowing for further investigation

There are also many specialised third party products that offer the services listed above. Splunk is an example of a popular monitoring, logging and SIEM service.
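At its core, alerting is the comparison of current metrics against configured thresholds, the decision a CloudWatch alarm feeding SNS would make. A minimal sketch with illustrative metric names and limits:

```python
def check_thresholds(metrics, thresholds):
    """Return the names of metrics whose current values breach their
    configured limits. Metric names and limits are illustrative."""
    return [name for name, value in metrics.items()
            if name in thresholds and value > thresholds[name]]

# Current readings versus the limits derived from normal behaviour
current = {"errors_per_min": 42, "avg_duration_ms": 180, "throttles": 0}
limits = {"errors_per_min": 10, "avg_duration_ms": 1000, "throttles": 5}
print(check_thresholds(current, limits))  # ['errors_per_min']
```

Each breached metric would then fan out as an SNS notification to the Operations and Security teams.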

Incident Response

Any organisation's security stance should assume that a security incident will occur. A security policy, along with business continuity and disaster recovery plans, should be created, reviewed and maintained.

Application metrics should be monitored to understand normal behaviour. Anomalies to the baseline can identify a security incident occurring. Implementing alerts based on unexpected behaviour of the application can allow the Operations or Security team to investigate and respond.

Rehearsals of common incidents should be carried out frequently. Some rehearsals may require a full interruption test, which disrupts the organisation. With the benefits of the cloud and the ability to provision infrastructure instantaneously, parallel tests can be carried out by deploying temporary solutions without affecting the organisation.

You’re Never Finished

Serverless solutions have outstanding benefits for organisations that do not want to manage the underlying platform. Changes in technologies and local regulations may lead to changes in the security of the Serverless solution. Implementing security controls is never a one-time job. The threat landscape is large and always growing. New methods are continuously being developed by malicious actors to disrupt and penetrate systems. It’s crucial that organisations strive to stay ahead of threats and review their security posture regularly. Security is an eternal process and the work is never done.

Read more: Why You Must Modernise Your Applications With AWS

About the Author

Sat Gainda is a Cloud Solutions Architect at Version 1, working on enterprise-level engagements that utilise innovative Cloud systems. Stay tuned to Version 1 on Medium for more Cloud-focused posts from Sat.
