Maintaining REST API Documentation with Node.js — Part II

Nelson Gomes
Pipedrive R&D Blog
Jan 11, 2022
Combining multiple OpenApi definitions into one, automatically updated

This article details how you can mingle multiple individual service OpenApi schemas into a global one, remapping their routes so that it’s easy to keep your entire API up to date.

The first part of this article was well received; some issues were resolved and improvements were made. We would like to thank you for the feedback and input. It’s with great pleasure that we post the second part of this article and also share some great news with the community.

New features introduced in version 1.0.0 of ts-openapi, a breaking-change update:

  • Methods can now be marked as deprecated
  • Support for data models was added
  • Support for parameters was added
  • Support added to combine multiple service definitions (check the last section of this article)

Documentation has been updated and the sample server was also updated to this version. The only limitation, for now, is that it only works with OpenApi 3.0.x schemas. Until swagger-ui-express or swagger-ui supports 3.1.x versions, I think upgrading doesn’t make much sense.

In this article, we will cover four topics: the first two are relatively simple, while the latter two are more intricate yet fascinating. We will explain how to fetch different service definitions and use them to keep your full API up to date automatically. Check the mingle server demo video at the bottom of the article for reference.

Below is a list of topics that we cover in this article:

Part II:

  • Creating a validation middleware to make your APIs rock solid
  • Automating documentation generation
  • Fetching different service schemas
  • Combining multiple OpenAPI schemas to keep your full API spec up to date

Step 1: Creating a middleware to make your APIs rock solid

In the context of Express, a middleware is a function that runs before and/or after a request handler, and it becomes a powerful input sanitizer when combined with Joi. However, it’s important to not only verify that all declared parameters exist and are correct but also check whether the request contains any elements that were not declared.

For example, imagine we had a body definition and a middleware to validate it. Then, imagine we used the spread (…) operator for a create or an update with knex. If you think the body is good to go, think again! When validation is configured to tolerate unknown parameters (as it must be for headers, and as is easy to do by accident for the body), Joi only validates the parameters you declared and lets the rest pass through. This means that even if you verified the correct parameters exist, extra parameters can change database columns whose names they match. So, for the body in particular, it’s crucial to make sure Joi does not allow unknowns in the schema.
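
To make the risk concrete, here is a minimal sketch assuming Joi and knex (the updateCustomer helper, the customers table and its is_admin column are made up for illustration):

import Joi from "joi";
import { knex } from "knex";

const db = knex({ client: "pg", connection: process.env.DATABASE_URL });

const bodySchema = Joi.object({ name: Joi.string().required() });

async function updateCustomer(id: number, body: unknown) {
  // with allowUnknown enabled, extra keys survive validation untouched
  const { value, error } = bodySchema.validate(body, { allowUnknown: true });
  if (error) throw error;

  // if body was { name: "John", is_admin: true }, the undeclared is_admin
  // key is spread into the update and overwrites a matching column
  await db("customers").where({ id }).update({ ...value });
}

// the fix for bodies: validate with { allowUnknown: false }, so Joi rejects
// undeclared keys instead of letting them through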

We need to validate four different data inputs, each with different rules:

  • Body: Optional parameters are fine here, but unknowns must not be allowed, for security reasons
  • Query: You could allow unknowns, but it’s safer to just block them. Query parameters are optional.
  • Params: No unknowns allowed, and all parameters are required, since they are part of the URL and we don’t want any surprises here
  • Headers: We must allow unknowns here since we don’t know the headers sent by all browsers. Note that header names must be lowercase, as this is how Express delivers them internally. Header parameters are optional.

With this setup, we have hardened input validation and made our API a lot safer. You can add cookie validation too, if you use cookies. Keep in mind that your schema definition for each data source must match your middleware defaults for required and optional fields; we recommend always marking parameters explicitly as either optional or required.

Next, let’s create a middleware for each endpoint, using a factory function that returns a validation function per endpoint, according to its definition:

Code to generate middleware functions
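
A minimal sketch of what such a factory can look like, assuming Joi and Express (names are illustrative and the demo server’s actual implementation may differ):

import Joi from "joi";
import { NextFunction, Request, Response } from "express";

// one optional schema per data input, matching the rules above
interface EndpointSchemas {
  body?: Joi.ObjectSchema;
  query?: Joi.ObjectSchema;
  params?: Joi.ObjectSchema;
  headers?: Joi.ObjectSchema;
}

// validate a value against a schema, throwing on the first problem
function check(value: unknown, schema: Joi.ObjectSchema, allowUnknown: boolean) {
  const { value: result, error } = schema.validate(value, { allowUnknown });
  if (error) throw error;
  return result;
}

// returns a validation middleware for one endpoint definition
export function validationMiddleware(schemas: EndpointSchemas) {
  return (req: Request, res: Response, next: NextFunction) => {
    try {
      // body, params and query: no unknowns allowed
      if (schemas.body) req.body = check(req.body, schemas.body, false);
      if (schemas.params) req.params = check(req.params, schemas.params, false);
      if (schemas.query) req.query = check(req.query, schemas.query, false);
      // headers: unknowns must be allowed (browsers send plenty we never declared)
      if (schemas.headers) check(req.headers, schemas.headers, true);
      next();
    } catch (error) {
      res.status(400).json({ message: (error as Error).message });
    }
  };
}

// usage per endpoint (createCustomerBody and createCustomerHandler are hypothetical):
// app.post("/customer", validationMiddleware({ body: createCustomerBody }), createCustomerHandler);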

We have used this middleware in our demo server, under “list customers” and “create customer,” if you want to take a look.

Step 2: Automating documentation generation

Next, select your tools of choice. One of the amazing things about OpenApi is the wide selection of tools available: code generators for different languages, validation tools, documentation generators and more.

For the purpose of this demo, we’ll be using widdershins.

Steps to generate your documentation:

  1. Install widdershins: npm i -g widdershins
  2. Start your local server: node ./src/server.js
  3. Generate documentation manually from the URL of the server:
    widdershins -c --summary "http://localhost:8000/api-json" -o api.md
    We excluded code generation to get plain textual documentation, but there are plenty of different options available for you to choose from, like language selection and even custom templates
  4. You may also add it to your package.json, inside scripts:
    "scripts": {
      "documentation": "npx widdershins -c --summary \"http://localhost:8000/openapi.json\" -o api.md"
    }, (…)
    and call it by running npm run documentation. Just don’t forget to start your server first!

By following these steps, you will ensure your documentation is readable and always up to date:

API documentation, automatically generated

Step 3: Fetching different service schemas

To this point, we have only discussed a single service and explained how to generate an OpenApi schema for it. But what if you have hundreds of services, each with its own schema and documentation? Even worse, what if you must keep the public documentation up to date?

A nightmare for all technical writers

Usually, large APIs aren’t implemented by a single service, but by hundreds of services working together to create a complete API, each running from a few to hundreds of instances, depending on request volume and complexity.

What’s behind that “magic”? The answer is load balancers or reverse proxies of some sort, like Nginx, Traefik, HAProxy or Perlbal and, for those who can afford it, hardware-optimized SSL accelerators.

These load balancers usually terminate the HTTPS layer and deliver plain HTTP traffic to your internal network, more specifically to the service that handles it (assuming it’s safe to do so). Internal services thus handle regular unencrypted traffic within their datacenter, which is lighter to process, and reply to the load balancer, which encrypts the response back to the customer’s browser.

Mingle multiple service schemas concept

Let’s assume we wanted to implement “Awesome API Explorer” with three services: users, products and emails.

In that case, requests to https://awesome.api/users/* would need to be routed to the internal users service at http://users/public/*. This means that whenever a request matches /users/*, the load balancer rewrites the request path to /public/* and delivers it to an instance of that service. Next, let’s assume this service’s schema is available at http://users/private/openapi.json.

Requests to https://awesome.api/products/* would be routed to the products service at http://products/*, with its schema available at http://products/openapi.json.

Requests to https://awesome.api/emails/* would be routed to the emails service at http://emails/api/*, with its schema available at http://emails/api-schema.json.

So, from the above service definitions, we identify three components: public path, private path and schema location (used for updates).
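
The remapping itself is just a prefix swap on each path. As a sketch, a hypothetical helper could look like this:

// swap a private path prefix for a public one, e.g.
// remapPath("/public/{userId}", "/public/", "/users/") returns "/users/{userId}"
function remapPath(path: string, privatePrefix: string, publicPrefix: string): string {
  if (!path.startsWith(privatePrefix)) {
    throw new Error(`Path ${path} is outside the private prefix ${privatePrefix}`);
  }
  return publicPrefix + path.slice(privatePrefix.length);
}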

One important note about security: Some services have both private and public APIs. In that case, you should design your API in a way that doesn’t expose it to risks. Not declaring your methods publicly doesn’t cut it risk-wise: as long as an endpoint can be reached, it can also be hacked.

We suggest that you keep service schemas private since they might give away information about your service infrastructure to hackers.

In short, we need to fetch three schemas from the addresses below, convert private paths to public paths and then declare the new public server addresses:

http://users/private/openapi.json, map paths from /public/* to /users/*
http://products/openapi.json, map paths from /* to /products/*
http://emails/api-schema.json, map paths from /api/* to /emails/*

Retrieving a few schemas from different URLs is trivial; the most important step is to create your own update rules and system. If you need your API schema updated immediately, we recommend a trigger mechanism, such as a queue or a webhook that receives an update event, reads the new schema from the specific service and then produces a new combined API (a sketch of such a webhook follows the server code below). Otherwise, you can update your API at regular intervals, every minute or hour, depending on your needs. If an error occurs during service mingling, make sure to log it, but keep the previous mingled service definition active until the error is fixed.

Next, let’s create a list with our service configurations:

const services = {
  "users": {
    schemaUrl: "http://127.0.0.1:2000/private/openapi.json",
    publicPrefix: "/users/",
    privatePrefix: "/public/",
    type: "consul",
  },
  "products": {
    schemaUrl: "http://127.0.0.1:3000/openapi.json",
    publicPrefix: "/products/",
    privatePrefix: "/",
    type: "static",
  },
  "emails": {
    schemaUrl: "http://127.0.0.1:4000/api-schema.json",
    publicPrefix: "/emails/",
    privatePrefix: "/api/",
    type: "consul",
  },
};

With this list, we can collect each individual definition and combine them to generate a global one. Note that we’ve assigned each definition a type. The reason is that consul-based services may have their IP/port changed, so prior to retrieving their schemas, we should always refresh their address/port. You may need to do the same with other service discovery systems.
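
For reference, each entry in the list follows a shape like this (a TypeScript sketch; the demo’s actual types may differ):

interface ServiceConfig {
  schemaUrl: string; // where to fetch this service's OpenApi schema
  publicPrefix: string; // path prefix exposed by the load balancer
  privatePrefix: string; // path prefix the service actually serves
  type: "consul" | "static"; // how the service address is resolved before fetching
}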

Our public API will be kept up to date by our mingle server; the full demo is available at https://github.com/nelsongomes/server/tree/main/src/mingle-demo.ts:

// assumed imports and constants (PORT 8888 and a one-minute refresh match the
// text below; getMingledApi is defined in the demo file linked above)
import express, { Application } from "express";
import swaggerUi from "swagger-ui-express";

const PORT = 8888;
const REFRESH_SECONDS = 60;

const app: Application = express();

// read initial schema
let currentSchema = await getMingledApi();

// create an endpoint to reply with the openapi schema
app.get("/api/openapi.json", function (_req, res) {
  res.setHeader("Cache-Control", "no-store, must-revalidate");
  res.setHeader("Expires", "0");
  res.json(currentSchema);
});

const options = {
  swaggerOptions: {
    url: "/api/openapi.json",
  },
};

// this will make the openapi UI available with our definition
app.use(
  "/api",
  swaggerUi.serveFiles(undefined, options),
  swaggerUi.setup(undefined, options)
);

// try to refresh the schema every X seconds
setInterval(async () => {
  try {
    currentSchema = await getMingledApi();
  } catch (e) {
    // log any errors during the mingle attempt; failed attempts must not
    // crash the server because we keep the state from the latest success
    console.log(e);
  }
}, REFRESH_SECONDS * 1000);

// start server
app.listen(PORT, function () {
  console.log(`Server is listening on port ${PORT}! Click http://127.0.0.1:${PORT}/api/`);
});

This server makes our public API interface available at http://127.0.0.1:8888/api/ and our API schema available at http://127.0.0.1:8888/api/openapi.json, and it tries to update its schema every minute.
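
If you need immediate updates instead of polling, the trigger mechanism mentioned earlier can be as small as an extra endpoint on this same server (a sketch; the route name is made up and authentication of the caller is omitted):

// hypothetical webhook: a service calls this after deploying a new schema
app.post("/internal/schema-updated", async (_req, res) => {
  try {
    currentSchema = await getMingledApi();
    res.status(204).end();
  } catch (e) {
    // keep the previous mingled schema active and report the failure
    console.log(e);
    res.status(500).json({ message: "schema refresh failed" });
  }
});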

Step 4: Combining multiple OpenAPI schemas to keep your full API spec up to date

Combining multiple service schemas is not an easy task for developers (a simplified sketch follows the list below). You need to:

  • Filter private routes (not accessible from outside)
  • Verify if operation IDs are unique across all services
  • Remap private paths to public paths
  • Check if schemas are being referenced (in body and each of the possible response codes) and copy them
  • Check if a schema declared by multiple services has the same value (we should avoid shared schemas when possible to avoid conflicts)
  • Check if a referenced schema itself references other schemas internally, and copy those, too
  • Check if parameters are being referenced and copy them
  • Check if security schemes match the declaration
  • Check local security schemes (method level) against declared ones and verify scopes
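
As a heavily simplified sketch of the first three checks (route filtering, operationId uniqueness and path remapping), assuming the openapi-types package for typings and ignoring schema, parameter and security copying:

import { OpenAPIV3 } from "openapi-types";

// merge one service's paths into the global document (simplified)
function minglePaths(
  global: OpenAPIV3.Document,
  service: OpenAPIV3.Document,
  privatePrefix: string,
  publicPrefix: string,
  seenOperationIds: Set<string>
): void {
  for (const [path, pathItem] of Object.entries(service.paths)) {
    // filter private routes: only paths under the private prefix are exposed
    if (!pathItem || !path.startsWith(privatePrefix)) continue;

    // verify operation ids are unique across all services
    for (const entry of Object.values(pathItem)) {
      const operationId = (entry as OpenAPIV3.OperationObject).operationId;
      if (operationId && seenOperationIds.has(operationId)) {
        throw new Error(`Duplicate operationId: ${operationId}`);
      }
      if (operationId) seenOperationIds.add(operationId);
    }

    // remap the private path to its public equivalent
    global.paths[publicPrefix + path.slice(privatePrefix.length)] = pathItem;
  }
}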

A note about models: We recommend defining domain-driven schemas, so that a single service owns all the methods that handle products and declares the model for them. If you have multiple services declaring the same model, you might end up with conflicting definitions. Therefore, to avoid multiple definitions of a Product, all services needing Product data should call the same service for it.

To test mingle server, download the code from https://github.com/nelsongomes/server, edit the file src/mingle-demo.ts to add your configuration and run it with ts-node src/mingle-demo.ts.

To illustrate this article, we’ve created a few replicas of our demo service with the above-mentioned configurations, some security schemes, models and properties. You can also access the mingle server’s UI at http://127.0.0.1:8888/api/.

Mingle server demo

To test your schema, we recommend the Swagger editor at https://editor.swagger.io/: just paste your JSON schema into the tool, and it will point out any schema issues you might have.

To support your setup, don’t forget to set up your load balancer properly by routing:

  • /api/ to mingle server
  • /users/ to users service
  • /products/ to products service
  • /emails/ to emails service

Finally, given that we now have a way to ensure our public API is always up to date, we can easily automate documentation for it by using widdershins, as explained in step 2.

This project was born to help us tackle combined distributed schemas for large APIs. Since it has become increasingly ambitious, you might find some bugs and issues in our workflow. If so, please share your experience with us. We hope you make good use of the information in this article and would like to thank you for reading.


Interested in working at Pipedrive?

We’re currently hiring for several different positions in several different countries/cities.

Take a look and see if something suits you

Positions include:

  • Junior Developer
  • Full-Stack Developer
  • Lead Engineer
  • Junior Software Engineer in DevOps Tooling
  • Senior Front-End Developer
  • Infrastructure Engineer
  • And several more


Nelson Gomes
Pipedrive R&D Blog

Works at Pipedrive as a senior SRE, holds a degree in Informatics from the University of Lisbon and a postgraduate degree in project management.