Painless Serverless

Destructuring services into functions automatically

Synopsis

This article describes an approach to building web services for serverless computing platforms. The approach can save time and effort and lower the risk of bugs by making service development more manageable. Its key feature is automatic destructuring of services into isolated, ready-for-deployment functions. Examples in the article are given for GCP Cloud Functions written in Python.

Rationale

Serverless is a really great concept. If you don’t know what it is yet, go do some searching; it’s worth being familiar with. Like any technology, it comes with its own set of advantages and disadvantages.

Several important advantages include precise billing, easy scaling, and feature-centric profiling, logging, monitoring and updates. Two key ideas can be extracted from all that: utilize your resources effectively and pay only for what you actually use. Ah, and let your computing platform provider carry the whole maintenance burden.

At the same time, nothing comes for free, and the drawbacks are tightly tied to the advantages. Effective utilization of resources requires functions to be as light as possible and to have as few dependencies as possible. As a result, they end up independent from each other, with little in common to reuse. Why is this bad? Well, many things in the world do have things in common. In the land of web services, think about serialization and deserialization, authentication, authorization, retries and logging, to say the least. If your service has more than one web view, all of that can easily apply to each of them with pretty much the same logic. Hence, there’s a need to extract things into separate abstractions and use them as dependencies. Doesn’t that sound like exactly what we need to avoid? We need to keep things small and independent, but we want them to be unified and consistent. Moreover, we want to minimize the implementation-level effort caused by a change made at a more abstract level. And we also want to track such a change as a single change in the version control system. Oh, bummer.

At this point you may ask: “Okay, so what’s the matter? It’s been a long time since we had to go all the way down to assembler to be able to organize, manage and reuse abstractions.” Well, it’s not that simple: you live in the real world, and it can set firm constraints on you whether you like it or not.

For example, in GCP Cloud Functions the only things you can deploy are a single Python module with the code to run and a single requirements file with the dependencies to install. And Python web services have to be written using the Flask framework. Deal with it.

GCP Console — Creation of cloud function written in Python

Just take a minute and think about what you have just seen. This is not how you have been developing before. It’s not even an editor like the one in AWS Lambda. Just 2 files to define a function. Nothing else. I bet you are not going to publish your proprietary logic shared among views as a public PyPI package just to include it as a dependency via “requirements.txt”. And there’s no way to point the engine to your own PyPI server either.


If you are screaming right now, just take your time.


So… is there any way to use Cloud Functions for making more complex services than “hello world”? Even if the issue was not with Cloud Functions, is there any way to deal with the situation in general? Can we keep our reusable logic separately but still have it deployed along with every unique function?

My answer is “yes”. You can develop services as usual and you can automatically extract ready-for-deployment functions out of them. Let’s see how this can be done by deploying a Python service to GCP Cloud Functions.

Example Service Overview

In this article we will use a tiny service, IL-2 FB Difficulty Editor, as an example. The service allows its users to edit difficulty settings of the “IL-2 Sturmovik: Forgotten Battles” flight simulator. Difficulty settings are just boolean flags stored as an integer in a config file. The aim of the service is to allow users to edit settings via a human-friendly UI instead of calculating an integer in their heads. This is valuable when you need to configure a dedicated server, as it does not have its own UI.
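The flags-packed-into-an-integer idea can be sketched in a few lines of Python. The flag names and bit positions below are made up for illustration; the real simulator defines many more of them:

```python
# Hypothetical difficulty flags; the real game defines dozens of them.
FLAGS = {
    "WIND_TURBULENCE": 1 << 0,
    "FLUTTER_EFFECT":  1 << 1,
    "NO_PADLOCK":      1 << 2,
}

def decompose(value):
    """Turn the stored integer into a dict of boolean settings."""
    return {name: bool(value & bit) for name, bit in FLAGS.items()}

def compose(settings):
    """Turn a dict of boolean settings back into a single integer."""
    return sum(bit for name, bit in FLAGS.items() if settings.get(name))

print(decompose(5))
# {'WIND_TURBULENCE': True, 'FLUTTER_EFFECT': False, 'NO_PADLOCK': True}
print(compose({"WIND_TURBULENCE": True, "NO_PADLOCK": True}))  # 5
```

Decomposing an integer and composing a new one after a toggle are exactly the two operations the service’s views expose.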

Example service “IL-2 FB Difficulty Editor” — User interface

We need only 3 web views to build such a service:

  1. Get UI initial data.
  2. Decompose user-entered integer.
  3. Compose new integer when user toggles a settings flag.

These views are pretty simple. But even they have a bunch of logic to share:

  1. JSON serialization.
  2. Formatting of HTTP responses.
  3. CORS handling.

Let’s note this one more time: there are just 3 simple views and 3 simple abstractions shared among them. And we can have only 1 Python file per view. If you were daring enough to copy-and-paste the shared logic across views, you would find it quite painful pretty soon. Just imagine: you need to make a change to, say, the output structure of your HTTP responses. You will end up making the same change in 3 separate files. What if there were X abstractions, Y modules and Z changes? I hope you get it; these should be obvious things. If you are comfortable with that, please keep your hands away from the keyboard (or whatever you use as an input method).

It would be much better if we could extract the shared logic into a separate package named, say, core. Next, it would make sense to keep the service’s own logic separately as well, e.g., in a package named difficulty. And let’s keep all of that as a project named demo_services, so its structure will be:

demo_services
├─ core
│  ├─ cors.py
│  ├─ json.py
│  └─ response.py
└─ difficulty
   ├─ serializers.py
   └─ views.py

In this example all views have the same external dependencies, which can be listed as:

Flask==1.0.2
il2fb-difficulty==1.3.1
itsdangerous==0.24

At this point we have quite a sound code organization we can live with. We could call it a “micromonolith”. Let’s now think about how we can deploy it to Cloud Functions!

Monolith Destructuring

With our views located in the demo_services/difficulty/views module, wouldn’t it be nice to be able to extract them as ready-for-deployment artifacts? Something like the following:

python-object-extractor demo_services.difficulty.views:get_data \
    -m get_data/main.py \
    -r get_data/requirements.txt

That would create main.py and requirements.txt files inside the get_data directory, and that would be enough to deploy the views:get_data function to GCP.

If you agree that would be nice, you are welcome: just go grab the python-object-extractor package from PyPI and take a look at its docs for usage examples.

With the object extractor installed, you can execute the command above and get exactly what you expect.

For example, the views:get_data function will be destructured into its own main.py and requirements.txt files. main.py will contain the function itself along with all the objects from the local serializers module and the core package it depends on, while requirements.txt will list all the external dependencies needed for the extracted code to work.

How does this work?

Good question. Remember that everything in Python is an object? Functions, classes, types, modules, packages and so on are objects. You can reference them, load them, inspect them, get their attributes and source code, and so on.

So, you can load a target object, parse its code to grab tokens, look for definitions of those tokens in the object’s module and in the module’s imports, and take only the objects referenced directly. If something is imported from an external library, list that library as a dependency. Apply this logic to all local dependencies recursively, resolve imports, resolve name conflicts, sort the objects topologically, and output the sources of all objects as a single module along with the list of dependencies.

Sounds a bit non-trivial, huh? Definitely: it’s a non-trivial task which appears even more complex once you start digging into details. Fortunately, you don’t need to crack your head trying to solve it. Just use the extractor.
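To give a taste of the first step, here is a tiny sketch of collecting the bare names referenced in a piece of source code using the standard library’s ast module. This is only an illustration of the idea, not how python-object-extractor is actually implemented:

```python
import ast

SOURCE = '''
def greet(user):
    return json.dumps({"hello": user})  # 'json' must be resolved somewhere
'''

def referenced_names(source):
    """Collect bare names referenced in source code: these are the
    candidates to look up among the module's globals and imports."""
    tree = ast.parse(source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

print(sorted(referenced_names(SOURCE)))  # ['json', 'user']
```

The real tool then has to decide, for each such name, whether it is a builtin, a local object whose source must be extracted too, or an import from an external library that becomes a requirement.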

What about external dependencies?

To make extraction work you need to have all external dependencies installed locally. This should not be an issue, as you can always use environment isolation tools like venv.

Which versions will be listed in output requirements?

The answer is: whatever version is installed inside the currently active site-packages directory. Installed packages usually have *.dist-info directories containing metadata about them, and that’s where the versions are taken from.
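On Python 3.8+ the standard library exposes that *.dist-info metadata via importlib.metadata, so pinning a locally installed version can be as simple as the sketch below (shown with pip, assuming it is installed in the current environment):

```python
from importlib import metadata

def pin(package):
    """Pin a package to whatever version is installed locally,
    as read from its *.dist-info metadata."""
    return f"{package}=={metadata.version(package)}"

print(pin("pip"))  # e.g. 'pip==23.2.1' -- depends on your environment
```

This is essentially what ends up in the generated requirements.txt: one `name==version` line per external dependency.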

Version Control

Now we have two representations of the same solution: one is the original source and the other is its generated mirror. It is reasonable to ask: what should we track in the version control system, and how?

Ewww. That’s a bit of a tough question, because it smells like duplication. But this little evil is needed.

Firstly, extraction can be automated. Secondly, the generated code can be kept in an isolated branch. In that case it’s possible to set up continuous deployment as described in the next section. Checking in code generated from one branch into another is left as an exercise for the reader, though.

Deployment

The most convenient way to deploy your functions is to set up a push trigger for the branch of the source repository that contains the extracted functions. To do that you’ll need a cloudbuild.yaml file, which describes the deployment steps.

Please refer to the “Triggering Cloud Functions deployments” article for an explanation of the deployment steps needed for Cloud Functions. An example build file for deploying our view is shown below:

# cloudbuild.yaml
steps:

- name: 'gcr.io/cloud-builders/gcloud'
  args: [
    'beta', 'functions', 'deploy',
    'difficulty-get-data',
    '--entry-point', 'get_data',
    '--set-env-vars', 'CORS_ALLOW_ORIGIN=*',
    '--trigger-http',
    '--runtime', 'python37',
    '--memory', '128MB',
    '--region', 'europe-west1',  # watch out for your region
  ]
  waitFor: ['-']  # wait for nobody, i.e. parallelize
  dir: 'get_data'

The image below depicts the definition of an example trigger.

GCP Cloud Build — Example of a source repository trigger definition

For a full example of working build file refer to demo-services/cloudbuild.yaml.

An example report for a build triggered by a push to the source repo is shown below:

GCP Cloud Build — Example of a source repository-triggered build report

Each build triggers redeployment of the functions. An example of a deployed function’s dashboard is shown below:

GCP Cloud Functions — Example function’s dashboard

Local Development

We are now able to destructure our views and push them to a serverless platform as functions. But how are we going to develop and test them? There’s no web app yet; all we have is just a bunch of web views.

No problem, let’s make an app then. And let’s make it load and register our views dynamically.

functions.yml

First of all, let’s create functions.yml with definitions of the functions exposed by the project:

- name: difficulty-get-data
  handler: demo_services.difficulty.views:get_data

This maps names of functions to the corresponding view handlers. This file might also be used to generate cloudbuild.yaml.
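Generating cloudbuild.yaml from these definitions could be a trivial templating exercise. A minimal sketch (the template content mirrors the example build file above; the function definitions are assumed already loaded, e.g. with PyYAML):

```python
STEP_TEMPLATE = """\
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['beta', 'functions', 'deploy', '{name}',
         '--entry-point', '{entry_point}',
         '--trigger-http', '--runtime', 'python37']
  waitFor: ['-']
  dir: '{entry_point}'
"""

def build_steps(functions):
    """Render cloudbuild.yaml content from functions.yml-style definitions."""
    steps = []
    for function in functions:
        # The entry point is the part after ':' in the handler path.
        entry_point = function["handler"].rsplit(":", 1)[1]
        steps.append(STEP_TEMPLATE.format(name=function["name"],
                                          entry_point=entry_point))
    return "steps:\n\n" + "\n".join(steps)

print(build_steps([
    {"name": "difficulty-get-data",
     "handler": "demo_services.difficulty.views:get_data"},
]))
```

One definition per function in, one deployment step per function out: keeping functions.yml as the single source of truth avoids editing the build file by hand.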

local_endpoints.yml

Next, it makes sense to define local endpoints. One name and a list of HTTP methods per endpoint is enough for our purposes. Let’s define endpoints in a local_endpoints.yml file; here’s the definition of our example endpoint:

- name: difficulty-get-data
  methods: ['GET', 'OPTIONS']

Note that the OPTIONS method is needed to make CORS preflight requests work. We want to keep things as similar to production as possible, right?
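As a rough idea of what the shared CORS logic might look like, here is a sketch of building the response headers from the CORS_ALLOW_ORIGIN environment variable that our deployment step sets. This is an illustration, not the actual contents of core/cors.py:

```python
import os

def cors_headers():
    """Build CORS response headers; the allowed origin is taken from
    the CORS_ALLOW_ORIGIN env var set at deploy time."""
    return {
        "Access-Control-Allow-Origin": os.environ.get("CORS_ALLOW_ORIGIN", ""),
        "Access-Control-Allow-Methods": "GET, OPTIONS",
        "Access-Control-Max-Age": "3600",
    }

os.environ["CORS_ALLOW_ORIGIN"] = "*"  # what we set for local development
print(cors_headers()["Access-Control-Allow-Origin"])  # *
```

Because the origin comes from the environment, the very same code behaves correctly both in GCP and in the local app: only the variable’s value differs.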

Request injector

If you have taken a look at the definition of the get_data view, you might have noticed it takes a request object as its first parameter:


def get_data(request):
    pretty = 'pretty' in request.args

This object is the same flask.request you have been using in usual Flask projects before.

The issue is that it’s passed as an argument only in Cloud Functions environment. Plain Flask does not pass it as an argument, because in plain Flask it’s a global object which has to be imported.

So, how do we make our view run as a function in GCP and as a plain Flask view locally without making changes to it? Let’s inject the request into the local version of the view!

That can easily be achieved with a with_request decorator:

from functools import wraps

from flask import request


def with_request(view_func):

    @wraps(view_func)
    def wrapper(*args, **kwargs):
        return view_func(request, *args, **kwargs)

    return wrapper

That’s it: just take any function and force the request to be its first parameter.
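The injection pattern itself has nothing Flask-specific in it. Here is the same decorator demonstrated with a plain dict standing in for flask.request, so it can be run without a Flask app context:

```python
from functools import wraps

# A stand-in for flask.request, just to demonstrate the injection pattern.
fake_request = {"args": {"pretty": "1"}}

def with_request(view_func):
    @wraps(view_func)
    def wrapper(*args, **kwargs):
        # Prepend the request; the caller never passes it explicitly.
        return view_func(fake_request, *args, **kwargs)
    return wrapper

@with_request
def get_data(request):
    return "pretty" in request["args"]

print(get_data())  # True -- the request was injected by the decorator
```

In the real local app the decorator closes over flask.request instead, and Flask resolves that global per request, so each view call sees the right request object.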

local_run.py

Finally, it’s time to make a web app. We’ll build a simple application dynamically. To achieve that we need to:

  1. Read definitions of endpoints from local_endpoints.yml.
  2. Read definitions of functions from functions.yml.
  3. Load the functions and inject the request object into them.
  4. Set up environment variables. In our example we’d like to bypass CORS during local development, so the CORS_ALLOW_ORIGIN variable is set to *.
  5. Construct URL rules for the Flask app.
  6. Create the Flask app and run it.
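The core trick in step 3 is resolving a `module:function` handler string from functions.yml into an actual callable. A minimal sketch, demonstrated with a stdlib object instead of the demo project’s views:

```python
import importlib

def load_handler(path):
    """Resolve a 'package.module:function' handler string from
    functions.yml into the actual callable."""
    module_name, _, attr_name = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)

# In the real app this would be e.g.
# load_handler("demo_services.difficulty.views:get_data")
handler = load_handler("os.path:basename")
print(handler("/tmp/demo.txt"))  # demo.txt
```

Each loaded handler is then wrapped with with_request and registered on the Flask app via add_url_rule, using the name and methods from the two YAML files.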

An implementation of that is available as local_run.py. Running it spawns a plain Flask app:

 * Serving Flask app "local_run" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 935-883-901

Phew! We are able to run the same code locally and deploy it to GCP!

Conclusions

Serverless is a concept definitely worth studying and using. But not all tooling around it is perfect, or even simply good. Sometimes you are left all alone with lots of manual work. This article described a possible way to make usage of GCP Cloud Functions easier. Was it a successful endeavour? I hope it was, but depending on the circumstances the answer is: maybe.

What is definitely true is that you can develop services as usual and extract separate views from them automatically. Deployments of those views can be triggered automatically via source repo triggers. And an application for local development can be built automatically as well. Feels like good stuff.

But there are several questions still remaining open.

First of all, using 2 branches or 2 repos for the same code feels neither healthy nor entirely convenient.

Second, there’s no way to deploy only the changed functions: a push of the sources triggers redeployment of all declared functions.

Next, the gap between local and production environments can turn out to be way bigger than just a difference in a function’s arguments.

Finally, it’s up to you to put many pieces together, including running functions as views of a local app.

Let’s hope the toolset around serverless will become more mature in the future.