Authors: Davide Taibi, Nabil El Ioini, Claus Pahl, Jan Raphael Schmid Niederkofler
In our recent scientific publication, we collected the serverless patterns proposed by practitioners in technical talks, blog posts, and whitepapers. The goal is to support practitioners in understanding the different patterns by classifying them and reporting their possible benefits and issues.
We adopted a multivocal literature review process, surveying peer-reviewed and grey literature and classifying patterns (common solutions to solve common problems), together with benefits and issues.
Among 24 selected works, we identified 32 patterns that we classified as orchestration and aggregation, event management, availability, communication, and authorization.
In this post, we summarize the patterns identified, together with their benefits and issues.
More information on the method adopted to summarize the patterns, and on the results, can be found in our paper (Taibi et al., 2020).
The Patterns Proposed by Practitioners
We identified 32 patterns that we classified into five categories, namely:
- orchestration and aggregation
- event management
- availability
- communication
- authorization
Orchestration and Aggregation
These patterns can be used to compose serverless functions or to orchestrate their execution, creating more complex functions or microservices.
Aggregator [S12][S2][S1] (also known as “Durable functions” [S3])
Problem: Exposing a single endpoint for several APIs.
Solution: A function calls APIs separately, then aggregates the results and exposes them as a singular endpoint.
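As a minimal sketch of the Aggregator, a single handler can call each backend API and merge the results into one response. The service names and fields below are hypothetical, and the backend calls are stubbed in-process where a real deployment would issue HTTP requests to separate services.

```python
# Hypothetical Aggregator sketch: one handler fans out to several backend
# APIs and merges their responses into a single payload.

def get_user_profile(user_id):
    # stand-in for a call to a profile service
    return {"id": user_id, "name": "Alice"}

def get_user_orders(user_id):
    # stand-in for a call to an orders service
    return [{"order_id": 1, "total": 9.99}]

def aggregator_handler(event, context=None):
    """Single endpoint that hides the individual backend APIs."""
    user_id = event["user_id"]
    return {
        "profile": get_user_profile(user_id),
        "orders": get_user_orders(user_id),
    }

result = aggregator_handler({"user_id": 42})
```

Clients see only the aggregated endpoint, never the individual APIs.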
Data Lake [S4] [S5]
Problem: Keeping up with evolving requirements of data transformation and processing can be a hassle.
Solution: The data lake is a physical storage for raw data, where data is processed and deleted as little as possible. Organizing it with sensible metadata and naming conventions is a must for keeping order.
Benefits: The raw data always remains the same, independent of the needs of the moment. It can be transformed just in time as necessary.
Fan-In/Fan-Out [S12][S8][S2][S4] (also known as “Virtual Actors” [S6], “Data transformation” [S7], “Processor” [S9], “Fire triggers and transformations” [S3])
Problem: Enable the execution of long tasks that exceed the maximum execution time (similar to Function Chain).
Solution: Split the work into parallel tasks and aggregate the results at the end. The parallel execution leads to faster completion.
Issues: Strong coupling between the chained functions. As for the Function Chain, splitting the tasks between functions can be complex [S10].
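The fan-out/fan-in flow can be sketched as follows. A thread pool stands in for the parallel function invocations, and the chunking scheme and `worker` function are purely illustrative.

```python
# Fan-Out/Fan-In sketch: split a long job into chunks, process the chunks
# in parallel "workers", then aggregate (fan in) the partial results.
from concurrent.futures import ThreadPoolExecutor

def worker(chunk):
    # each worker would run as its own serverless function
    return sum(chunk)

def fan_out_fan_in(data, chunks=4):
    size = max(1, len(data) // chunks)
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(worker, parts))  # fan-out
    return sum(partials)                          # fan-in: aggregate

total = fan_out_fan_in(list(range(100)))
```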
Function chain [S10][S3]:
Problem: Enable to execute long tasks that exceed the maximum execution time (e.g., longer than 15 minutes in Lambda).
Solution: Combine functions in a chain. An initial function starts the computation while keeping track of the remaining execution time. Before reaching the maximum execution time, the function invokes another function asynchronously, passing each parameter needed to continue the computation. The initial function can then terminate without affecting the next function in the chain.
Issues: Strong coupling between the chained functions, increased number of functions. Splitting the tasks between functions can be complex [S10].
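The hand-off described above can be sketched as below. The `invoke_async` stub and the in-memory `pending` list stand in for the platform's asynchronous invocation API (for example, Lambda's Event invocation type), and the time budget is purely illustrative.

```python
# Function Chain sketch: do work until close to the execution limit, then
# asynchronously invoke the next link in the chain with the remaining work.
import time

pending = []  # simulates the platform's async invocation queue

def invoke_async(payload):
    pending.append(payload)

MAX_RUNTIME = 0.05  # stand-in for the platform's execution limit (seconds)

def chained_handler(event):
    start = time.monotonic()
    i = event["next_item"]
    processed = event.get("processed", 0)
    while i < event["total"]:
        # ... one unit of work would happen here ...
        processed += 1
        i += 1
        if time.monotonic() - start > MAX_RUNTIME / 2:
            # hand off the remaining work before hitting the limit
            invoke_async({"next_item": i, "total": event["total"],
                          "processed": processed})
            return {"status": "handed_off", "processed": processed}
    return {"status": "done", "processed": processed}

# drive the chain locally: keep executing queued invocations
result = chained_handler({"next_item": 0, "total": 1000})
while pending:
    result = chained_handler(pending.pop(0))
```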
Proxy [S11] (also known as “Command pattern” [S13], “Anti-Corruption Layer” [S3])
Problem: Integration of functions with a legacy system.
Solution: Create a function that acts as a proxy for another service, handling any necessary protocol or data format translation.
Benefit: Clean and easy-to-access API for clients.
Queue-Based Load Leveling [S8][S2][S3] (also known as “The Scalable Webhook” [S12], “The Throttler” [S1])
Problem: Building scalable webhooks with non-scalable back-ends. Webhooks make it possible to augment or alter the behavior of a web page or web application with custom callbacks.
Solution: Similar to the Frugal Consumer, a queue service can be used to trigger a function, which allows queueing the requests under heavy load.
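The idea can be sketched with an in-memory queue standing in for a managed queue service (such as SQS); the webhook only enqueues, while a separate consumer function drains requests at a pace the back-end can sustain. All names below are illustrative.

```python
# Queue-Based Load Leveling sketch: the webhook never touches the
# non-scalable back-end directly; it only enqueues work.
from collections import deque

queue = deque()  # stand-in for a managed queue service

def webhook(payload):
    queue.append(payload)
    return 202  # accepted: respond immediately, process later

def consumer_function(batch_size=2):
    # triggered by the queue; drains requests at the back-end's pace
    handled = []
    for _ in range(min(batch_size, len(queue))):
        handled.append(queue.popleft())
    return handled

for i in range(5):
    webhook({"event": i})
first_batch = consumer_function()
```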
The Frugal Consumer [S12]
Problem: Increase scalability of non-scalable backends.
Solution: A function that processes the requests of multiple services (or functions) that post messages directly to a message queue.
The Internal API [S12]
Problem: Accessing microservices that should only be accessed from within the cloud infrastructure.
Solution: Bypass the API Gateway and call the functions directly using the platform's invocation API.
Benefits: Increased security as services are not accessible from outside.
The Robust API [S2][S12] (also known as “The Gateway” [S3])
Problem: Sometimes clients know which services in the back-end they want to use.
Solution: Use an API Gateway to grant clients access to selected services.
Benefits: Allows handling clients more individually.
Issues: Increased complexity.
The Router [S12] (also known as “Routing Function” [S13], “Decoupled Messaging” [S2], “Data probing” [S9])
Problem: Distribute the execution based on the payload, without paying the extra cost of the orchestration systems adopted in the State Machine pattern.
Solution: Create a function that acts as a router, receiving the requests and invoking the related functions based on the payload.
Benefit: Easy implementation.
Issues: The routing function needs to be maintained. Moreover, it can introduce performance bottlenecks and be a single point of failure. Double billing occurs, since the routing function needs to wait until the target function terminates its execution.
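A minimal Router sketch: one function inspects the payload and dispatches to the matching target. The targets here are local stubs with hypothetical names; in practice they would be separate serverless functions invoked synchronously (hence the double-billing issue noted above).

```python
# Router sketch: dispatch requests to target functions based on payload.

def create_order(payload):
    return {"action": "created", "item": payload["item"]}

def cancel_order(payload):
    return {"action": "cancelled", "item": payload["item"]}

ROUTES = {"create": create_order, "cancel": cancel_order}

def router_handler(event):
    target = ROUTES.get(event.get("type"))
    if target is None:
        return {"error": "unknown route"}
    return target(event)  # synchronous call: router waits (double billing)

resp = router_handler({"type": "create", "item": "book"})
```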
Thick Client [S13]
Problem: Any intermediary layer between client and service increases costs and latency.
Solution: Allow clients to directly access services and orchestrate workflows.
Benefits: Increased performance, reduced cost at the server side, increased separation of concerns (Roberts, 2016).
The State Machine [S7] [S2] [S12]
Problem: Orchestration and coordination of functions.
Solution: Adopt serverless orchestration systems such as AWS Step Functions or IBM Composer to orchestrate complex tasks.
Issue: The complexity of the system increases, as well as the development effort.
Event Management
These patterns help to solve event management problems.
Responsibility Segregation [S3]:
Problem: When the same functions are used for both queries and data updates, the system risks becoming inflexible.
Solution: Segregate the functions that update data from those that read data. Use separate “Commands” and “Queries” functions to avoid this congestion.
Distributed Trigger [S12] (also known as “Event Broadcast” [S13])
Problem: Coupling a message queue topic only with its own service.
Solution: Couple multiple services to a single notification function, possibly via message queues. This setup works well if the topics have only a single purpose and do not need any data outside their own microservice.
Issues: The subscriptions to the queue topic remain the responsibility of the individual services.
FIFO [S12], [S13]
Problem: Create a FIFO (first in, first out) queue for serverless functions. Several message services do not support a FIFO approach.
Solution: Use a scheduler such as AWS CloudWatch to periodically invoke the function asynchronously. Then, set the function’s concurrency to 1 so that there are no attempts to run competing requests in parallel. The function polls the queue for (up to 10) ordered messages and does whatever processing it needs to do. Once the processing is complete, the function removes the messages from the queue and then invokes itself again (asynchronously). This process repeats until all the items have been removed from the queue.
Benefits: Simple sequentialization.
Issues: Cascading effect [S12]: if the function is busy processing other messages, the scheduled job will fail because of the concurrency setting. If the self-invocation is blocked, the retries will continue the cascade.
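The poll-process-reinvoke loop described above can be sketched as follows, with an in-memory deque standing in for the queue service and a plain loop standing in for the asynchronous self-invocation (which, with concurrency fixed at 1, runs strictly sequentially anyway).

```python
# FIFO consumer sketch: poll up to 10 ordered messages, process them,
# remove them from the queue, then "re-invoke" until the queue is empty.
from collections import deque

queue = deque(f"msg-{i}" for i in range(25))  # ordered message queue
processed = []

def fifo_consumer():
    batch = [queue.popleft() for _ in range(min(10, len(queue)))]
    for msg in batch:
        processed.append(msg)   # process; message already removed above
    return bool(queue)          # True => re-invoke itself asynchronously

# concurrency of 1 means invocations run strictly one after another
while fifo_consumer():
    pass
```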
The Internal Hand-off [S12]
Problem: Invoke a function asynchronously using the Event invocation type.
Solution: The function stops automatically when the execution is finished and automatically retries when needed. Attaching a Dead Letter Queue to the message queue makes it possible to capture failures.
Periodic Invoker [S7]
Problem: Execute tasks periodically.
Solution: Subscribe the function to a scheduler, such as AWS CloudWatch, Google Cloud Scheduler, or Azure Scheduler.
Benefits: Run functions periodically without the need to keep them permanently alive.
Polling Event Processor [S11] (also known as “Polling consumer” [S22])
Problem: React to state changes of external systems that do not publish events.
Solution: Use the Periodic Invoker pattern to check the state of the service.
Benefits: Run functions periodically without the need to keep a function permanently alive as a listener.
Availability
This group of patterns helps to solve availability problems, reducing the warm-up time and possible failures.
Bulkhead [S20], [S3]
Problem: When a crucial, possibly load-heavy, function fails, the complete system risks being compromised.
Solution: Partition workloads into different pools. These pools can be created based on consumer load or availability.
Benefits: This isolates failures and reduces the risk of a chain reaction of failures.
Circuit breaker [S12][S14]
Problem: Repeated calls to a failing or slow API waste resources and time.
Solution: Keep track of failed or slow API calls. When the number of failures reaches a certain threshold, “open” the circuit and send errors back to the calling client immediately, without even trying to call the API. After a short timeout, “half-open” the circuit, sending just a few requests through to see if the API is responding correctly again; all other requests receive an error. If the sample requests are successful, “close” the circuit and start letting all traffic through. However, if some or all of those requests fail, the circuit is opened again.
Benefits: Cost saving for synchronous requests.
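A toy circuit breaker illustrating the closed/open/half-open states described above; the threshold and the way the timeout is triggered are simplified for the sketch.

```python
# Minimal circuit breaker with closed / open / half-open states.

class CircuitBreaker:
    def __init__(self, threshold=3):
        self.failures = 0
        self.threshold = threshold
        self.state = "closed"

    def call(self, api):
        if self.state == "open":
            raise RuntimeError("circuit open: failing fast")
        try:
            result = api()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.state = "open"   # stop calling the broken API
            raise
        # a success (in closed or half-open state) closes the circuit
        self.failures, self.state = 0, "closed"
        return result

    def try_half_open(self):
        # called after a timeout to let a sample request probe the API
        if self.state == "open":
            self.state = "half-open"

def flaky():
    raise ValueError("backend down")

cb = CircuitBreaker(threshold=2)
for _ in range(2):
    try:
        cb.call(flaky)
    except ValueError:
        pass
state_after_failures = cb.state
```

After the timeout, `try_half_open()` lets one probe through; a success closes the circuit again.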
Compiled Functions [S15]
Problem: Serverless cloud computing would be a perfect fit for IoT, especially at the edge, were it not so heavy-weight in memory footprint and invocation time.
Solution: A high-level, specialized, ahead-of-time compiled serverless language can reduce the memory footprint and invocation time. This might make serverless viable for edge technology in the cloud.
Function Warmer [S12][S16][S23] (also known as “Function Pinging” [S10], “Warmer Service” [S9], “Cold Start” [S21], “Keeping Functions Warm” [S8], “keep-alive” [S3])
Problem: Reduce the cold start time, i.e., the delay before the execution of a function after someone invokes it. Serverless functions are executed in containers that encapsulate and execute them. When a function is invoked, the container keeps running only for a certain time period after the execution (warm); if another request comes in before the shutdown, it is served instantaneously. A cold start takes between 1 and 3 seconds [S2][S23]. For example, AWS (Shilkov, 2019b) and Azure (Shilkov, 2019a) recycle idle function instances after a fixed period of 10 and 20 minutes, respectively.
Solution: Ping the function periodically to keep it warm.
Benefit: Reduction of response times from 3 seconds to 200 milliseconds.
Issues: Increased cost, even if limited to only one call every 10–15 minutes.
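A common way to implement the warmer is to have the scheduled ping carry a marker field so the handler can short-circuit it instead of running the real business logic. The `warmer` field name and the handler below are illustrative, not a platform convention.

```python
# Function Warmer sketch: scheduled pings keep the container warm; the
# handler detects them and returns early without doing real work.

def do_work(event):
    return event["x"] * 2   # stand-in for the real business logic

def handler(event):
    if event.get("warmer"):
        return {"warmed": True}      # short-circuit the scheduled ping
    return {"result": do_work(event)}

ping = handler({"warmer": True})   # what the scheduler sends
real = handler({"x": 21})          # a genuine request
```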
Oversized Function [S10]
Problem: In serverless, it is not possible to choose which CPU a function runs on.
Solution: Asking for more memory also grants a faster virtual machine, even if no more memory is required.
Read-heavy report engine [S12][S2]
Problem: Overcome the downstream limits of read-intensive applications.
Solution: Use data caches and create specialized views of the most frequently queried data.
Benefits: Increased performance.
The Eventually Consistent [S12]
Problem: Replicate data between services to keep them consistent.
Solution: Use database stream services (e.g., DynamoDB Streams) to react to changes made to the database by previous functions and reuse the data wherever needed.
Timeout [S17]
Problem: The timeout of the API Gateway is 29 seconds, which is a long time for a user to wait and makes for a bad experience with the service.
Solution: Reduce the timeout to a shorter span, preferably around 3–6 seconds.
Communication
Here, we describe patterns to communicate between functions.
Data Streaming (also known as “Stream and Pipeline” [S2], “I am a Streamer” [S12], “Event Processor” [S13][S22], “Streaming Data Ingestion” [S4], “Stream Processing” [S3])
Problem: Manage continuous stream of data.
Solution: Serverless platforms offer services like Kinesis (AWS) to handle and distribute large streams of data to services.
Issues: Data streams can be expensive in serverless. Working outside the platform's ecosystem can be difficult too.
Externalized state [S10] [S3] (also known as “Share State” [S21])
Problem: In some cases, it is necessary to share state between functions.
Solution: Share the state by saving it into an external database.
Issues: High coupling between the functions, latency overhead [S10], additional programming effort.
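A minimal sketch of Externalized State: two stateless functions share state through an external store. A plain dict stands in for the database (DynamoDB or Redis in a real deployment), and all names are illustrative.

```python
# Externalized State sketch: functions stay stateless; shared state lives
# in an external store keyed by a request identifier.

store = {}  # stand-in for the external database

def step_one(event):
    state = {"user": event["user"], "progress": 1}
    store[event["request_id"]] = state      # persist state externally
    return {"request_id": event["request_id"]}

def step_two(event):
    state = store[event["request_id"]]      # load the shared state
    state["progress"] += 1
    store[event["request_id"]] = state
    return state

handoff = step_one({"request_id": "r1", "user": "alice"})
final = step_two(handoff)
```

Note how only the `request_id` travels between functions; everything else goes through the store, which is exactly the source of the coupling and latency issues noted above.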
A stream-processing variant of the Data Streaming solution: “Apply a continuous stream processor that captures large volumes of events or data, and distributes them to different services or data stores as fast as they come.” [S2] Romero [S2] and Daly [S12] propose an AWS-specific example, using the AWS API Gateway as a Kinesis proxy. In this way, it is possible to use any number of services to pipe data to a Kinesis stream. Finally, Kinesis can be used to aggregate the results.
Publish/Subscribe [S2][S3][S18] (also known as “The Notifier” [S12])
Problem: Forward data to internal services (or APIs).
Solution: Use a standalone topic in the message queue to distribute internal notifications for internal services.
Authorization
These patterns deal with user authorization problems.
The Gatekeeper [S3][S12]
Problem: Authorize Functions.
Solution: Use an API Gateway with an authorizer function that processes the authorization header and returns the authorization policy.
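A Gatekeeper authorizer can be sketched as below. The response shape loosely follows the AWS API Gateway Lambda-authorizer format, but the token store and field values here are illustrative, not a definitive contract.

```python
# Gatekeeper sketch: an authorizer function maps a token to an
# allow/deny policy that the gateway then enforces.

VALID_TOKENS = {"secret-token": "user-123"}  # stand-in for a token store

def authorizer_handler(event):
    token = event.get("authorizationToken", "")
    principal = VALID_TOKENS.get(token)
    effect = "Allow" if principal else "Deny"
    return {
        "principalId": principal or "anonymous",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event.get("methodArn", "*"),
            }],
        },
    }

allowed = authorizer_handler({"authorizationToken": "secret-token"})
denied = authorizer_handler({"authorizationToken": "bad"})
```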
Valet key [S19]
Problem: Authorization without routing all the traffic through a gatekeeper process.
Solution: The client first requests access from a special authorizer serverless function and is granted a token that is valid for a certain period of time and carries specific access rights. The client then accesses the resource directly with the token.
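The token hand-off can be sketched as follows: an authorizer issues a short-lived token bound to one resource, and subsequent accesses bypass the gatekeeper entirely. The token store, TTL, and function names are hypothetical; in AWS this role is often played by presigned URLs.

```python
# Valet Key sketch: issue a short-lived, resource-scoped token, then let
# the client access the resource directly without a gatekeeper in the path.
import secrets
import time

TOKENS = {}  # issued token -> (expiry timestamp, allowed resource)

def issue_valet_key(resource, ttl=60):
    token = secrets.token_hex(8)
    TOKENS[token] = (time.time() + ttl, resource)
    return token

def access_resource(token, resource):
    expiry, allowed = TOKENS.get(token, (0, None))
    if time.time() > expiry or allowed != resource:
        return "denied"
    return f"data from {resource}"  # direct access, no gatekeeper

key = issue_valet_key("report.csv")
ok = access_resource(key, "report.csv")
bad = access_resource(key, "other.csv")
```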
Trends and Open Issues
Serverless moves us towards continuous development and delivery. An important observation is that, differently from microservices, serverless-based applications do not require developing a full application stack. That means that their infrastructure resources, like data stores and networks, do not need to be managed, as they are the responsibility of the cloud provider. Resulting from our study, but also from the discussions in Leitner et al. [S10], we can identify the following emerging issues:
- Comparison between microservices and serverless functions. A microservice can be composed of one or more serverless functions. However, how to combine functions into a complete application or into a microservice is still not clear. Researchers might support practitioners by proposing the adoption of previously developed techniques for aggregating distributed systems or basic software engineering techniques.
- Lack of stable tools. The current state of tools supporting serverless development is still limited. Different tools are continuously proposed on the market, increasing the complexity of decisions for long-term development.
- Reuse of Functions. What happens once a growing system has thousands or millions of functions is still not clear. Will it be possible to maintain good system understandability with such a complex system? Grouping functions into isolated microservices might help, but at the moment it is still not clear how to proceed.
- Negative Results. In which contexts do serverless, and in particular some specific patterns, turn out to be counterproductive? Are there anti-patterns? All the aforementioned points require more experience reports and empirical investigations.
This post summarizes the results published in our scientific paper (link to download):
Taibi, D., El Ioini, N., Pahl, C., and Schmid Niederkofler, J. R. (2020). Patterns for Serverless Functions (Function-as-a-Service): A Multivocal Literature Review. Proceedings of the 10th International Conference on Cloud Computing and Services Science (CLOSER 2020).
[S1] Baldini, I., Castro, P., Chang, K., Cheng, P., Fink, S., Ishakian, V., Mitchell, N., Muthusamy, V., Rabbah, R., Slominski, A., and Suter, P. (2017). Serverless Computing: Current Trends and Open Problems.
[S2] Romero, E. (2019). Serverless microservice patterns for AWS. https://medium.com/@eduardoromero/serverless-architectural-patterns-261d8743020/.
[S3] Likness, J. (2018). Serverless apps: Architecture, patterns, and Azure implementation. Microsoft Developer Division, .NET, and Visual Studio product teams.
[S4] Serverless architectural patterns and best practices (ARC305-R2) — AWS re:Invent 2018. https://www.slideshare.net/AmazonWebServices/serverless-architectural-patterns-and-best-practices-arc305r2-aws-reinvent-2018.
[S5] Benghiat, G. (2017). The data lake is a design pattern. https://medium.com/data-ops/the-data-lake-is-a-design-pattern-888323323c66/.
[S6] Bernstein, P. A., Porter, T., Potharaju, R., Tomsic, A. Z., Venkataraman, S., and Wu, W. (2019). Serverless event-stream processing over virtual actors. In CIDR.
[S7] Hong, S., Srivastava, A., Shambrook, W., and Dumitras, T. (2018). Go serverless: securing cloud via serverless design patterns. In 10th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 18).
[S8] Zambrano, B. (2018). Serverless Design Patterns and Best Practices: Build, secure, and deploy enterprise ready serverless applications with AWS to improve developer productivity. Packt Publishing Ltd.
[S9] Shafiei, H., Khonsari, A., and Mousavi, P. (2020). Serverless computing: A survey of opportunities, challenges and applications.
[S10] Leitner, P., Wittern, E., Spillner, J., and Hummer, W. (2019). A mixed-method empirical study of function-as-a-service software development in industrial practice. Journal of Systems and Software, 149:340–359.
[S11] Pekkala, A. (2019). Migrating a web application to serverless architecture. Master’s Thesis in Information Technology, University of Jyväskylä.
[S12] Daly, J. (2019). Serverless microservice patterns for AWS. https://www.jeremydaly.com/serverless-microservice-patterns-for-aws.
[S13] Sbarski, P. (2017). Serverless Architectures on AWS. Manning.
[S14] Nygard, M. T. (2007). Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers). 1st edition.
[S15] Gadepalli, P. K., Peach, G., Cherkasova, L., Aitken, R., and Parmer, G. Challenges and opportunities for efficient serverless computing at the edge.
[S16] Bardsley, D., Ryan, L. M., and Howard, J. (2018). Serverless performance and optimization strategies. 2018 IEEE International Conference on Smart Cloud (SmartCloud).
[S17] Lumigo (2019). AWS Lambda timeout best practices. https://lumigo.io/blog/aws-lambda-timeout-best-practices/.
[S18] Pirtle, J. (2019). 10 things serverless architects should know. https://aws.amazon.com/blogs/architecture/ten-things-serverless-architects-should-know/.
[S19] Adzic, G. and Chatley, R. (2017). Serverless computing: Economic and architectural impact. In Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, pages 884–889.
[S20] AWS (2018). Serverless Application Lens. https://d1.awsstatic.com/whitepapers/architecture/AWS-Serverless-Applications-Lens.pdf.
[S21] Group, I. S. (2019). AWS Lambda serverless coding best practices. https://www.intentsg.com/awslambda-serverless-coding-best-practices/.
[S22] Hohpe, G. and Woolf, B. (2004). Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. 1st edition.
[S23] Bhojwani, R. (2019). AWS Lambda timeout best practices. https://lumigo.io/blog/aws-lambda-timeout-best-practices/.
[S24] Rabbah, R., Mitchell, N. M., Fink, S., and Tardieu, O. L. (2019). Serverless composition of functions into applications. US Patent App. 15/725,756.