By grouping them into services based on well-defined patterns and implementing some reasonable optimizations
If you are new to Function-as-a-Service then you have legitimate questions. Functions have changed all the rules and you are trying to sort them out. One of those questions is: How do you manage so many small functions? I have been wrangling with these sorts of questions since shortly after AWS Lambda was introduced. Here is what I have found that works well.
For starters, I don’t manage functions individually; I group them into services. Each service is managed by a CloudFormation stack using the Serverless Framework. Here are some stats from a recent project. The system had a little over 100 functions. Managing this many individual functions would be a headache. Fortunately the system had just over 30 stacks (i.e. services) with 1–4 functions per service. That sounds a lot better! The functions follow a naming standard that starts with the service name, which makes it easy to find related functions in the console. CloudFormation handles the life cycle of the functions and cleans things up so that there are no orphans when you delete a stack. Each service has its own Git repository and CI/CD pipeline. Here are some examples from my cookbook with GitLab-CI and Bitbucket-Pipeline YAML files.
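To make this concrete, here is a minimal sketch of what one such service's serverless.yml might look like. The service name, handler paths, and stream ARN are hypothetical; note that the Serverless Framework's default function naming (service-stage-function) gives you the service-name prefix for free.

```yaml
# Hypothetical sketch of a service stack; names and ARN are illustrative only
service: thing-bff

provider:
  name: aws
  runtime: nodejs18.x

functions:
  graphql:
    handler: src/graphql/index.handle
    events:
      - http: POST /graphql
  listener:
    handler: src/listener/index.handle
    events:
      - stream: arn:aws:kinesis:us-east-1:123456789012:stream/event-stream
  trigger:
    handler: src/trigger/index.handle
```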
Grouping functions into services and having a completely automated CI/CD pipeline for each service is a big win. But at a glance I should also know what to expect to find inside each service. To this end, each service follows a well-defined pattern. In my books I categorize these as boundary patterns or control patterns. The boundary services interact with people or external systems and the control services handle the flow of inter-service collaboration. These patterns help team members move easily between services, facilitate governance and simplify planning. Here are some stats for these different flavors of services.
Backend For Frontend (BFF) services support the end users. These typically account for about 40% of the services and usually have 3 functions, ± 1. The graphql function provides the queries and mutations (i.e. commands) that are needed to support the specific user interface. The listener function consumes events from the stream and creates the materialized views used by the queries. The trigger function reacts to the mutations and produces new events to the stream.
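A minimal sketch of the listener and trigger roles in a BFF service might look like the following. The event shapes and the in-memory view store are assumptions for illustration; in a real service the materialized view would live in a table such as DynamoDB.

```javascript
// Hypothetical BFF sketch. A Map stands in for the materialized view table.
const viewStore = new Map();

// The listener consumes a batch of events from the stream and upserts
// the materialized view rows that the graphql function's queries read.
const listener = (events) => {
  events
    .filter((e) => e.type === 'thing-created' || e.type === 'thing-updated')
    .forEach((e) => {
      const current = viewStore.get(e.thing.id) || {};
      viewStore.set(e.thing.id, { ...current, ...e.thing, lastEventId: e.id });
    });
  return viewStore;
};

// The trigger reacts to mutations (e.g. database stream records)
// and produces new domain events back to the stream.
const trigger = (records) =>
  records.map((r) => ({
    type: `thing-${r.eventName === 'INSERT' ? 'created' : 'updated'}`,
    thing: r.newImage,
  }));
```

Each function does one thing: the listener only builds views, the trigger only publishes events.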
External Service Gateway (ESG) services provide an anti-corruption layer that encapsulates the details of interacting with other systems, such as 3rd party, legacy and sister systems. They act as a bridge to exchange events between the systems. These often account for about 50% of the services and usually have 2 functions, ± 1. The listener function consumes internal events from the stream and handles the egress of events out to the other system. The trigger function reacts to external events in another system and handles the ingress of those events from that other system.
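The anti-corruption layer of an ESG boils down to two mapping functions, sketched below. The field names on both sides are assumptions; the point is that the internal event format never leaks out and the external format never leaks in.

```javascript
// Hypothetical ESG sketch: egress mapping used by the listener,
// ingress mapping used by the trigger. Field names are illustrative.

// egress: map an internal event to the other system's payload format
const toExternal = (event) => ({
  externalId: event.thing.id,
  payload: { name: event.thing.name },
  occurredAt: event.timestamp,
});

// ingress: map an external payload (e.g. a webhook) to an internal event
const toInternal = (webhook) => ({
  type: 'partner-thing-updated',
  thing: { id: webhook.externalId, name: webhook.payload.name },
  timestamp: webhook.occurredAt,
});
```

Keeping the mappings pure makes them trivial to unit test without touching the external system.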
Control services help minimize coupling between services by mediating the collaboration between services. These typically account for about 10% of the services and usually have 2 functions, ± 1. The listener function consumes events from the stream and then correlates and collates them in a micro events store. The trigger function applies rules to the correlated events and produces higher order events to orchestrate the flow of collaboration.
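Here is a rough sketch of how a control service's trigger might apply declarative rules to a correlated set of events. The rule and event shapes are assumptions for illustration.

```javascript
// Hypothetical control service sketch: each rule declares when it fires
// and what higher-order event it emits. Shapes are illustrative only.
const rules = [
  {
    id: 'order-fulfilled',
    // fire once both events have been correlated for the same order
    when: (correlated) =>
      correlated['order-submitted'] && correlated['payment-confirmed'],
    emit: (correlated) => ({
      type: 'order-fulfilled',
      orderId: correlated['order-submitted'].orderId,
    }),
  },
];

// evaluate every rule against the correlated events and produce
// the higher-order events to publish back to the stream
const applyRules = (correlated) =>
  rules.filter((r) => r.when(correlated)).map((r) => r.emit(correlated));
```

Because each rule is just data plus two pure functions, a couple dozen rules can live in one trigger and still be tested one at a time.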
Grouping functions into services based on well-defined patterns definitely helps increase the predictability of the system and mitigate the complexity introduced by the fine granularity of functions. But the number of functions per service may seem a bit low, because I have purposefully introduced some reasonable optimizations to increase the coarseness of the functions without decreasing the maintainability or testability of each function. Let's take a look at these optimizations.
The graphql function of a BFF service typically defines around 5 queries and 3 mutations. If we were to implement these with REST we would likely end up with 8 functions instead of 1. Fortunately the GraphQL community has established repeatable patterns for creating maintainable and testable schemas and resolvers, such as here.
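One common shape for keeping a single graphql function maintainable is to build the resolver map from small, individually testable pieces. The query/mutation names and the injected connector below are assumptions, not a prescribed API.

```javascript
// Hypothetical sketch: one resolver map behind one graphql function.
// The connector abstracts the data store and is injected for testability.
const resolvers = (connector) => ({
  Query: {
    thing: (_, { id }) => connector.get(id),
    things: () => connector.scan(),
  },
  Mutation: {
    saveThing: (_, { input }) => connector.save(input),
  },
});
```

Each resolver stays a one-liner delegating to the connector, so adding the 8th query or mutation does not add an 8th function.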
The listener and trigger functions are stream processors. Stream processing is a natural fit for Functional Reactive Programming (FRP). I have previously provided an example here. I like to use a streams library called Highland. Among its many features, this library allows for forking and merging streams to create parallel substreams (i.e. pipelines). Each pipeline can be implemented and tested separately. A listener or trigger function will typically have 3 or more pipelines. In Control services these pipelines are implemented with declarative rules and a single control service could have a couple dozen rules. Without this optimization each pipeline would have been its own function. I will go into the details of these pipelines in a separate post.
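To illustrate the fork/merge idea without pulling in Highland, here is a plain-JavaScript stand-in: each pipeline is a separate function over a batch of events, and the listener forks the batch through every pipeline and merges the results. The event types and pipeline logic are assumptions.

```javascript
// Plain-JavaScript stand-in for Highland's fork/merge pattern.
// Each pipeline filters for the events it cares about and transforms them;
// pipelines are implemented and tested independently.
const pipeline1 = (events) =>
  events
    .filter((e) => e.type === 'thing-created')
    .map((e) => ({ ...e, handledBy: 'pipeline1' }));

const pipeline2 = (events) =>
  events
    .filter((e) => e.type === 'thing-deleted')
    .map((e) => ({ ...e, handledBy: 'pipeline2' }));

// the listener forks the batch through each pipeline and merges the output
const listener = (events) => [...pipeline1(events), ...pipeline2(events)];
```

With Highland the fork and merge are lazy and back-pressured, but the decomposition into per-pipeline units is the same.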
These optimizations greatly reduce the number of functions without straying from the idea of each function having a single responsibility. Each type of function in each pattern plays a well-defined role and nothing more. Given the stats provided above, these optimizations easily reduce the number of functions by a factor of 5. In other words, on my previous project, each service could have had 5–20 functions and the total number of functions could have been 500 instead of 100.
Here is one final thought. Serverless systems have naturally high observability because of the granularity of functions and other fully-managed resources. Leveraging this observability to monitor and alert on the health of the system is another key aspect of manageability. But I will save that topic for another post.