An API Platform as Code: From DSL to configuration

Peder Landsverk
Sopra Steria Norge
Aug 17, 2023

This article is about how we built an API publishing platform by treating the CI/CD process as a software engineering problem. In this first article, I describe how our CI/CD toolkit defines a DSL for our platform users and turns that DSL into actionable configuration for Terraform. While this concrete case specifically targets Azure API Management and associated Azure resources, the principles and practices shown here could apply to any underlying software or content platform.

Every platform has to balance two goals that are often in direct competition. On the one hand, you want to ensure that users do the right thing on your platform, either through nudging and documentation or through stricter measures like pre-merge checks. On the other hand, you want working on your platform to be as flexible and empowering as working outside of it. Even when platform requirements are important enough to warrant more invasive checks and routines, it is equally important to keep the platform as non-invasive as possible, so that developers can fulfill their own product requirements.

Ideally, platform and developer requirements are well aligned, making the platform an empowering tool rather than a hurdle. To get the platform development team into the right frame of mind with regard to balancing control and freedom, we decided to treat our platform users as customers and to design the platform to be as pleasant and empowering as possible.

DSL

In our case, this involved designing a YAML DSL (domain-specific language) that lets developers deploy and configure all of the resources required to expose their API through a friendly interface. While YAML [isn’t a great configuration language](https://github.com/cblp/yaml-sucks), it is very widely used and uniquely readable. That readability makes it a great interface for many kinds of developers and stakeholders. A snippet like the one below should be legible both to developers familiar with YAML and to auxiliary staff who, for example, handle API support and product management (and might be more averse to curly braces):

name: kbase
support:
  email: a.person@example.com
  name: Albert Persson
description: |
  An API that allows you to manage our internal knowledgebase.
auth:
  subscription_key: true
rate_limit:
  limit: 30
  per: second
versions:
  dev:
    description: An unstable version of the API that gets the latest updates.
    backend: https://dev.kbase.example.com
    development: true
  main:
    description: The stable version of the API that you can use in production.
    backend: https://api.kbase.example.com
products:
  - internal-documentation

This configuration file defines many of the key characteristics of the API that will be exposed. In fact, committing a file like this to the right place in our platform repository will create an API that is configured according to this specification and has all of the supporting resources required for it to function. The rest of this article is about the first leg of this journey: how these configuration files are turned into machine-interpretable configuration.

Configuration to Resources

This nice facade hides a lot of complexity from our platform users. The configuration will be turned into resources in the cloud, including API Management APIs and associated resources like API policies, app registrations and roles, user groups, products, and more. This last step of turning data into resources is a job for an IaC tool.

Terraform is a well-established industry standard with a large community around it, making it a natural choice among the IaC offerings available today. Thanks to the way it manages state, it lets us not only create the required resources from the configuration, but also delete resources that are no longer declared. This ideally makes the repository containing the API platform configuration a 1:1 representation of what will actually be present and visible to our users.

Of course, Terraform does not understand our nice YAML format. The configuration must be transpiled into JSON, which can be provided to the `terraform plan` command. In some cases, the YAML can be directly converted into JSON, and in others, we perform validation and template rendering before outputting the data. To do this, we wrote a Python package that provides a CLI tool that is run during the deployment pipeline.
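
To make this concrete, here is a minimal sketch of what that transpilation step could look like. The directory layout, file name, and function are assumptions for illustration only; the real CLI also performs validation and template rendering before writing its output:

# A minimal sketch of the transpilation step, not our actual CLI.
# The "apis/*.yaml" layout and the output structure are assumptions.
import json
from pathlib import Path

import yaml


def transpile(config_dir: Path, out_file: Path) -> None:
    """Read every API definition and emit a Terraform variable file."""
    apis = {}
    for path in sorted(config_dir.glob("*.yaml")):
        data = yaml.safe_load(path.read_text())
        apis[data["name"]] = data

    # terraform plan -var-file=terraform.tfvars.json picks this file up.
    out_file.write_text(json.dumps({"apis": apis}, indent=2))


if __name__ == "__main__":
    transpile(Path("apis"), Path("terraform.tfvars.json"))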

Pipeline as software

CI/CD processes can be set up and configured in many different ways, from the very simple to the more complex. The most common way to define pipelines is using pre-defined steps that are parameterized in a YAML file, interspersed with shell commands that let you do “custom” things.

Our repository evolved from this, using increasingly complicated Bash scripting to perform the various data transformation and check steps that were needed, until we realized that what we were actually doing was developing software. While not everyone might agree with this statement, we regard Bash as an unsuitable language for professional software development. Instead, we decided to develop our CI/CD pipeline with the same tools that we use for our other software projects.

This also allowed us to leverage the stack of quality control tools that we apply elsewhere: Flake8, unit tests with Pytest, isort, and Black. Having these quality control measures in place gives us confidence in the resulting pipeline, letting us add new features and improvements more freely, and perhaps also makes it easier to sleep at night.

In Action

So what does our CI/CD package do? To give an example, the rate limit block from the configuration above needs to be transpiled into an XML policy statement that must be provided to ARM as an `api_version_policy` resource. We use the excellent Jinja2 templating engine, with the following template for this specific case:

...
{% if api.rate_limit is not none %}
{% if api.auth.subscription_key %}
<rate-limit
{% else %}
<rate-limit-by-key
    counter-key="@(context.Request.IpAddress)"
{% endif %}
    calls="{{ api.rate_limit.limit }}"
    renewal-period="{{ 1 if api.rate_limit.per == "second" else 60 }}"
    remaining-calls-variable-name="remaining-api-calls"
    remaining-calls-header-name="Remaining-Api-Calls"
/>
{% endif %}
...

Not only does this template let us input the provided values for the rate limit from the YAML file, but it also lets us alter the resulting policy based on the desired authentication method, rate limiting by IP if the API does not use a subscription key. Of course, the same setting is also used to determine whether the deployed API actually requires a subscription key or not.
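
To sketch how such a template is rendered (the template file name is an assumption, and the dictionary below merely stands in for the parsed API model used in our pipeline):

# A hedged sketch of rendering the policy template with Jinja2.
# "api_policy.xml.j2" is a hypothetical file name for the template above.
from jinja2 import Environment, FileSystemLoader

env = Environment(
    loader=FileSystemLoader("templates"),
    trim_blocks=True,
    lstrip_blocks=True,
)
template = env.get_template("api_policy.xml.j2")

# Stand-in for the parsed configuration object.
api = {
    "auth": {"subscription_key": True},
    "rate_limit": {"limit": 30, "per": "second"},
}

# Renders roughly: <rate-limit calls="30" renewal-period="1" ... />
policy_xml = template.render(api=api)
print(policy_xml)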

In addition, the rate limit information is also rendered into the API documentation, which is thus always kept up to date with what is actually deployed. This kind of autodocumentation is used in many places, ensuring that the state in the cloud is always reflected in the documentation shown to API consumers:

<p>
Managed by
<a href="mailto:{{ api.support.email }}">{{ api.support.name }}</a>.
</p>

...

{% if api.rate_limit %}
<p>
This API has a general rate limit of {{ api.rate_limit.limit }} requests
per {{ api.rate_limit.per }}.
</p>
{% endif %}

...

We define several Pydantic models for our desired inputs and outputs. This lets us leverage Pydantic's excellent data validation capabilities, and also makes it easy to write code that works on the configuration objects after they have been parsed. For example, when assembling the configuration data into the final object that is provided as input to Terraform, we run integrity checks ensuring that all of the products referred to by the APIs are actually defined in the repository. This lets the pipeline fail faster: instead of attempting to apply a faulty configuration with an invalid product name, such errors are detected while the configuration is being read. A simplified sketch of the models, followed by the consistency check itself, is shown below.
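
As a rough sketch of what such models can look like (field names are taken from the YAML example above; the real models contain more fields and stricter validation):

# Simplified sketch of the configuration models; not the real definitions.
from typing import Dict, List, Optional

from pydantic import BaseModel


class Support(BaseModel):
    email: str
    name: str


class RateLimit(BaseModel):
    limit: int
    per: str  # "second" or "minute"


class Auth(BaseModel):
    subscription_key: bool = False


class Version(BaseModel):
    description: str
    backend: str
    development: bool = False


class Api(BaseModel):
    name: str
    support: Support
    description: str
    auth: Auth
    rate_limit: Optional[RateLimit] = None
    versions: Dict[str, Version]
    products: List[str] = []


class Product(BaseModel):
    name: str


class Config(BaseModel):
    apis: Dict[str, Api]
    products: List[Product]

The consistency check itself then operates on these parsed objects: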

from typing import Optional

import cicd.models.configuration


def check_api_product_consistency(
    config: cicd.models.configuration.Config,
) -> Optional[str]:
    """Check that all APIs refer to existing products."""
    errors = []
    product_names = {p.name for p in config.products}
    for api in config.apis.values():
        for product in api.products:
            if product not in product_names:
                errors.append(f"API {api.name} refers to undefined product {product}")

    return "\n".join(errors) if errors else None

If all of the functions with this signature return None (i.e. there are no errors), the script outputs a JSON file called `terraform.tfvars.json`, which is ingested by Terraform and turned into resources. Since the configuration has been pre-validated, the chance of tricky deploy-time errors is reduced, meaning that a pre-merge check that gathers and validates the configuration and then creates a Terraform plan will catch most consistency errors.
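
Because the pipeline is ordinary Python, checks like this are also straightforward to unit test with Pytest. A small sketch, reusing the simplified models from above (the import paths are hypothetical):

# Sketch of a Pytest unit test for the consistency check.
# The import paths are hypothetical; the real test suite covers more cases.
from cicd.checks import check_api_product_consistency
from cicd.models.configuration import Api, Auth, Config, Support


def test_undefined_product_is_reported():
    config = Config(
        apis={
            "kbase": Api(
                name="kbase",
                support=Support(email="a.person@example.com", name="Albert Persson"),
                description="Knowledge base API.",
                auth=Auth(subscription_key=True),
                versions={},
                products=["internal-documentation"],
            )
        },
        products=[],  # the referenced product is deliberately left undefined
    )

    errors = check_api_product_consistency(config)

    assert errors is not None
    assert "internal-documentation" in errors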

Summing up

This article has shown how we designed a friendly YAML facade for our API management platform, making it easy for anyone to define new APIs, while maintaining integrity guarantees through flexible and powerful validation, and keeping documentation and functionality consistent through template rendering. Empowering developers in this way has led to rapid adoption and high satisfaction among our users, which is our main priority as we develop the platform further.

The platform makes it easy and pleasant for developers to do the right thing, and to automate certain boring tasks like keeping documentation up to date. Helping and nudging developers in this way is the best and most consistent way to make sure that your platform is used, and that its content is of high quality.

Maintaining this increasingly complicated process as an in-repo code library allows us to apply software engineering best practices to the CI/CD process, making it more trustworthy and predictable. The result is a platform that is powerful, reliable, and trusted by our users, providing lots of functionality behind a clean and friendly interface.

In a coming article, I will show how the emitted configuration is used in our CI/CD pipeline to turn these code definitions into actual cloud resources.
