Destructuring services into functions automatically
This article describes an approach to building web services for serverless computing platforms. The approach can save time and effort and lower the risk of bugs by making service development more manageable. Its key feature is the automatic destructuring of services into isolated, ready-for-deployment functions. The examples in the article are given for GCP Cloud Functions written in Python.
Serverless is a really great concept. If you don’t know what it is yet, go do some searching; it’s worth being familiar with. Like any technology, it has its own set of advantages and disadvantages.
Several important advantages include precise billing, easy scaling, and feature-centric profiling, logging, monitoring and updates. Two key ideas can be extracted from all that: utilize your resources effectively and pay only for what you really use. Ah, and let your computing platform provider carry all the maintenance burden.
At the same time, nothing comes for free, and the drawbacks are tightly tied to the advantages. Effective utilization of resources requires functions to be as light as possible and to have as few dependencies as possible. As a result, they end up being independent from each other, with little in common that they can reuse. Why is this bad? Well, many things in the world do have things in common. In the land of web services, think about serialization and deserialization, authentication, authorization, retries and logging, to name a few. If your service has more than one web view, all of that can easily apply to each of them, with pretty much the same logic. Hence, there’s a need to extract things into separate abstractions and use them as dependencies. Doesn’t that sound like exactly the thing we need to avoid? We need to keep things small and independent, but we want them to be unified and consistent. Moreover, we want to minimize the implementation-level effort caused by a change made at a more abstract level. And we also want to track such a change as a single change in the source version control system. Oh, bummer.
At this point you may ask: “Okay, so what’s the matter? It’s been a long time since we had to go all the way down to assembler, and we’ve been able to organize, manage and reuse abstractions ever since.” Well, it’s not that simple: you live in the real world, and it can set firm constraints on you whether you like it or not.
For example, in GCP Cloud Functions the only things you can deploy are a single Python module with the code to run and a single requirements file with the dependencies to install. And Python web services have to be written using the Flask framework. Deal with it.
Just take a minute and think about what you have just seen. This is not how you have been developing before. It’s not even an in-browser editor like the one AWS Lambda has. Just 2 files to define a function. Nothing else. I bet you are not going to publish your proprietary logic shared among views as a public PyPI package just to include it as a dependency via “requirements.txt”. And there’s no way to point the engine to your own PyPI server either.
If you are screaming right now, just take your time.
So… is there any way to use Cloud Functions for building services more complex than “hello world”? Even if the issue were not specific to Cloud Functions, is there any way to deal with the situation in general? Can we keep our reusable logic separate but still have it deployed along with every unique function?
My answer is “yes”. You can develop services as usual and you can automatically extract ready-for-deployment functions out of them. Let’s see how this can be done by deploying a Python service to GCP Cloud Functions.
Example Service Overview
In this article we will use a tiny service, IL-2 FB Difficulty Editor, as an example. The service allows its users to edit the difficulty settings of the “IL-2 Sturmovik: Forgotten Battles” flight simulator. Difficulty settings are just boolean flags stored as a single integer in a config file. The aim of the service is to let users edit the settings via a human-friendly UI instead of calculating the integer in their heads. This is valuable when you need to configure a dedicated server, as it does not have its own UI.
We need only 3 web views to build such a service:
- Get UI initial data.
- Decompose the user-entered integer.
- Compose a new integer when the user toggles a settings flag.
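Since the whole service revolves around packing boolean flags into an integer, here is a tiny sketch of the idea. The flag names and bit positions below are made up for illustration; the real simulator defines its own layout:

```python
# Illustration of the bit-flag packing behind the service.
# Flag names and bit positions are made up for this sketch.
FLAGS = {
    "separate_engine_start": 0,
    "complex_engine_management": 1,
    "engine_overheat": 2,
}

def decompose(value):
    """Split a settings integer into named boolean flags."""
    return {name: bool(value & (1 << bit)) for name, bit in FLAGS.items()}

def compose(settings):
    """Pack named boolean flags back into a single integer."""
    return sum(1 << FLAGS[name] for name, enabled in settings.items() if enabled)

settings = decompose(5)   # 5 == 0b101: bits 0 and 2 are set
print(compose(settings))  # 5
```

The two functions are inverses of each other, which is exactly what the “decompose” and “compose” views need to guarantee.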
These views are pretty simple. But even they have a bunch of logic to share.
Let’s note this one more time: there are just 3 simple views and 3 simple abstractions shared among them. And we can have only 1 Python file per view. If you were daring enough to copy-and-paste the shared logic across views, you would find it quite painful pretty soon. Just imagine: you need to make a change to, say, the output structure of your HTTP responses. You will end up making the same change to 3 separate files. What if there were X abstractions, Y modules and Z changes? I hope you get it; these should be obvious things. In case you are comfortable with that, please keep your hands away from the keyboard (or whatever you use as an input method).
It would be much better if we were able to extract the shared logic into a separate package named, say, core. Next, it would make sense to keep the service’s own logic separate as well, e.g., in a package named difficulty. And let’s keep all that in a project called demo_services, so its structure will be:
demo_services/
├─ core/
│  ├─ cors.py
│  ├─ json.py
│  └─ response.py
└─ difficulty/
   ├─ serializers.py
   └─ views.py
In this example all views have the same external dependencies.
At this point we have a quite sound code organization we can live with. We could call it a “micromonolith”. Let’s now think about how we can deploy it to Cloud Functions!
With our views located in the demo_services/difficulty/views module, wouldn’t it be nice to be able to extract them as ready-for-deployment artifacts? Something like the following:
python-object-extractor demo_services.difficulty.views:get_data \
-m get_data/main.py \
That would create main.py and requirements.txt files inside the get_data directory, and that would be enough to deploy the views:get_data function to GCP.
With the object extractor installed, you can execute the command above and get exactly what you expect.
The views:get_data function will be destructured into its own directory: main.py will contain the function itself along with all the objects from the local serializers module and the core package it depends on, while requirements.txt will list all external dependencies needed for the extracted code to work.
How does this work?
Good question. Remember that everything in Python is an object? Functions, classes, types, modules, packages and so on are objects. You can reference them, load them, inspect them, get their attributes and source code, and so on.
So, you can load a target object, parse its code to grab tokens, look for definitions of those tokens in the object’s module and in the module’s imports, and take only the objects referenced directly. If a token comes from an import of an external library, list that library as a dependency. Apply this logic to all local dependencies recursively, resolve imports, resolve name conflicts, sort objects topologically, and output the sources of all objects as a single module with a list of dependencies.
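To make the first step less abstract, here is a minimal sketch of finding the direct dependencies of a function through introspection. This is only an illustration of the idea, not the extractor’s actual code:

```python
import ast
import inspect
import sys

def find_direct_dependencies(obj):
    """Collect names referenced in obj's source and resolve them
    against the globals of the module obj was defined in."""
    tree = ast.parse(inspect.getsource(obj))
    referenced = {node.id for node in ast.walk(tree)
                  if isinstance(node, ast.Name)}
    module = sys.modules[obj.__module__]
    # Keep only names that actually resolve in the defining module.
    return {name: getattr(module, name)
            for name in referenced if hasattr(module, name)}

def helper(x):
    return x * 2

def target(x):
    return helper(x) + 1

print(sorted(find_direct_dependencies(target)))  # ['helper']
```

The real task then repeats this recursively for every local dependency found, which is where the topological sorting and conflict resolution come in.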
Sounds a bit non-trivial, huh? It definitely is, and it appears even more complex once you start digging into the details. Fortunately, you don’t need to crack your head trying to solve that. Just use the extractor.
What about external dependencies?
To make extraction work, you need all external dependencies installed locally. This should not be an issue, as you can always use environment isolation tools like virtualenv.
Which versions will be listed in the output requirements.txt? The answer is: whatever versions are installed locally inside the currently active site-packages directory. Installed packages usually have *.dist-info directories containing metadata about them, and that’s where the versions are usually taken from.
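You can peek at this metadata yourself with the standard library’s importlib.metadata (available since Python 3.8), which reads versions from those very *.dist-info directories. The package “pip” is queried here simply because it is almost always present:

```python
from importlib import metadata

# Look up the installed version of a package from its *.dist-info
# metadata; "pip" is used only because it's almost always installed.
version = metadata.version("pip")
print(version)  # whatever version is installed locally
```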
Now we have two representations of the same solution: one is the original source and the other is its generated mirror. It would be reasonable to ask: what should we track in the source version control system, and how?
Ewww. That’s a bit of a tough question, because it smells like duplication. But this little evil is necessary.
Firstly, extraction can be automated. Secondly, generated code can be kept in an isolated branch. In this case it’s possible to set up continuous deployment as described in the next section. Checking code generated from one branch into another is left as an exercise for the reader, though.
The most convenient way to deploy your functions is to set up a push trigger for a branch of a source repository containing the extracted functions. To do that you’ll need a cloudbuild.yaml file, which describes the deployment steps.
Please refer to the “Triggering Cloud Functions deployments” article for an explanation of the deployment steps needed for Cloud Functions. An example build file for deploying our view is shown below:
steps:
  - name: 'gcr.io/cloud-builders/gcloud'
    args: [
      'beta', 'functions', 'deploy', 'difficulty-get-data',
      '--source', 'get_data',
      '--trigger-http',
      '--region', 'europe-west1',  # watch out for your region
    ]
    waitFor: ['-']  # wait for nobody, parallelize
The image below depicts the definition of an example trigger.
For a full example of working build file refer to demo-services/cloudbuild.yaml.
An example report for a build triggered by a push to the source repo is shown below:
Each build will trigger a redeployment of the functions. An example of a deployed function’s dashboard is shown below:
We are now able to destructure our views and push them as functions to a serverless platform. But how are we going to develop and test them at all? There’s no web app yet; all we have is just a bunch of web views.
No problem, let’s make an app then. And let’s make it load and register our views dynamically.
First of all, let’s define functions.yml with definitions of the functions exposed by the project:
- name: difficulty-get-data
That is needed to map the names of views to the corresponding functions. And this file might be used to generate deployment configuration as well.
Next, it makes sense to define local endpoints. One name and a list of HTTP methods per endpoint is enough for our purposes. Let’s define the endpoints in a local_endpoints.yml file; here’s the definition of our example endpoint:
- name: difficulty-get-data
  methods: ['GET', 'OPTIONS']
The OPTIONS method is needed to make CORS work. We want to keep things as similar to production as possible, right?
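For illustration, here is the kind of helper the shared core/cors.py might contain. The function names and the exact header set are my assumptions, not the article’s actual code:

```python
import os

# Hypothetical CORS helper in the spirit of core/cors.py.
def cors_headers():
    """Build CORS headers; the allowed origin is configurable via env."""
    return {
        "Access-Control-Allow-Origin": os.environ.get("CORS_ALLOW_ORIGIN", "*"),
        "Access-Control-Allow-Methods": "GET, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }

def maybe_preflight(method):
    """Answer an OPTIONS preflight request with an empty 204 response."""
    if method == "OPTIONS":
        return ("", 204, cors_headers())
    return None

print(maybe_preflight("OPTIONS")[1])  # 204
```

Reading the allowed origin from an environment variable is what lets the same code behave differently locally and in production, as we’ll see below.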
If you have taken a look at the definition of the get_data view, you might have noticed that it needs a request object as its first parameter:
pretty = 'pretty' in request.args
This object is just the same flask.request you have been using in usual Flask projects before. The issue is that it’s passed as an argument only in the Cloud Functions environment. Plain Flask does not pass it as an argument, because in plain Flask it’s a global object which has to be imported.
So, how do we make our view run as a function in GCP and as a plain Flask view locally, without making changes to it? Let’s inject request into the local version of the view!
That can be easily achieved by making a small decorator:
from functools import wraps

from flask import request

def inject_request(view_func):
    @wraps(view_func)
    def wrapper(*args, **kwargs):
        return view_func(request, *args, **kwargs)
    return wrapper
That’s it: just take any function and force request to be its first parameter.
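Here is the injection pattern in isolation, with a stand-in object used instead of flask.request so the sketch stays dependency-free:

```python
from functools import wraps

class FakeRequest:
    """Stand-in for flask's global request object (illustration only)."""
    args = {"pretty": "1"}

request = FakeRequest()

def inject_request(view_func):
    """Adapt a Cloud-Functions-style view (request as first argument)
    into a callable that picks request up from the ambient global."""
    @wraps(view_func)
    def wrapper(*args, **kwargs):
        return view_func(request, *args, **kwargs)
    return wrapper

# A view written for Cloud Functions: it expects request explicitly.
def get_data(request):
    return "pretty" in request.args

local_view = inject_request(get_data)
print(local_view())  # True
```

Note that thanks to functools.wraps, the wrapped callable keeps the original view’s name, which matters when registering URL rules.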
Finally, it’s time to make a web app. We’ll build a simple application dynamically. To achieve that we need to:
- Read the definitions of endpoints from local_endpoints.yml.
- Read the definitions of functions from functions.yml.
- Load the functions and inject the request object into them as an argument.
- Set up environment vars. In our example we’d like to bypass CORS during local development, so the CORS_ALLOW_ORIGIN variable is set to '*'.
- Construct URL rules for the Flask app.
- Make the Flask app and run it.
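The steps above can be sketched with the standard library alone. In the real runner, the two lists below would come from the YAML files via yaml.safe_load, and the resulting rules would be registered on a Flask app with add_url_rule; the object path here points at a stdlib function purely so the sketch is runnable, not at the article’s real views:

```python
from importlib import import_module

# Stand-ins for the parsed YAML files. A real functions.yml entry
# would reference something like demo_services.difficulty.views.
functions = [
    {"name": "difficulty-get-data", "object": "json:dumps"},
]
endpoints = [
    {"name": "difficulty-get-data", "methods": ["GET", "OPTIONS"]},
]

by_name = {item["name"]: item for item in functions}
url_map = {}
for endpoint in endpoints:
    # Resolve "module.path:attr" into an actual callable.
    module_path, _, attr = by_name[endpoint["name"]]["object"].partition(":")
    view = getattr(import_module(module_path), attr)
    url_map["/" + endpoint["name"]] = (view, endpoint["methods"])

print(sorted(url_map))  # ['/difficulty-get-data']
```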
An implementation of that is available as local_run.py. Executing the runner spawns a plain Flask app:
* Serving Flask app "local_run" (lazy loading)
* Environment: production
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
* Debug mode: on
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 935-883-901
Phew! We are able to run the same code locally and to deploy it to GCP!
Serverless is a concept definitely worth studying and using. But not all the tooling around it is perfect, or even simply good. Sometimes you are left all alone with lots of manual work. This article described a possible way to make using GCP Cloud Functions easier. Was it a successful endeavour? I hope it was, but depending on the circumstances the answer is: maybe.
What is definitely true is that you can develop services as usual and extract separate views from them automatically. Deploys of those views can be done automatically via source repo triggers. And an application for local development can be built automatically as well. Feels like good stuff.
But there are several questions still remaining open.
First of all, using 2 branches or 2 repos for the same code doesn’t feel healthy or totally convenient.
Second, there’s no way to deploy only the changed functions: a push of sources triggers redeployment of all declared functions.
Next, the gap between local and production environments can turn out to be way bigger than just a difference in a function’s args.
Finally, it’s up to you to put many things together, including running functions as views of a local app.
Let’s hope the toolset around serverless will become more mature in the future.