Documentation-driven development for APIs: what it is, how to do it, and why you should do it

Jose Haro Peralta
Published in Python Geek
10 min read · Oct 19, 2020

The code for this post is available under: https://github.com/abunuwas/documentation-driven-development

Photo by Nasa on Unsplash

Documentation-driven development is an approach to API development where you write the documentation first, and then implement the API as per the specification. If you have any clients of the API within your system (for example a frontend application) then you implement them against the specification as well. This approach is often also called API-first.

There’s often this idea that API changes should be driven by the backend, and that the backend can change the API at any time and then the API client (for example a frontend application) has to comply with whichever arbitrary changes were made to the backend.

Many developers have the idea that you can’t start working on an API client (for example a frontend application) until the backend API is implemented. This is simply not true: if you write the documentation first, then both the client application and the API server can be implemented at the same time.

However, not everything counts as documentation when it comes to APIs. APIs must be documented in standard formats, such as OpenAPI, so that we can leverage the benefits of documentation-driven development. I’ve seen many teams documenting their APIs using tools such as Confluence, Google Docs, SharePoint, Dropbox Paper, and similar. This doesn’t work because we can’t generate standard specifications from them that we can use in combination with other tools to test our API implementation.

How does this work in practice? And is it beneficial at all? I’m going to show you how to practice documentation-driven development to develop a REST API with Flask. The approach is the same for any other kind of API and it works with any other framework.

We’ll implement a simple API to manage a list of to-do items. The schema for the to-do item will have the following attributes:

  • ID in UUID format
  • Created timestamp
  • Task as a string
  • Status as a string, which can be one of the following: pending, progress, completed

The attributes ID and created will be set by the backend and therefore will be read-only for the API client. The attribute task represents the task that has to be done, and the status attribute represents the status of the task and can only take one of the enumerated values. For best practice and reliability, we’ll invalidate any requests that include any properties not listed in the schema.
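To make these rules concrete, here is a minimal sketch of how they could be enforced in plain Python, before any framework is involved. The names validate_new_item, ALLOWED_STATUSES and WRITABLE_FIELDS are made up for this illustration, and the fallback to "pending" anticipates the default declared in the specification further down.

```python
import time
import uuid

ALLOWED_STATUSES = {"pending", "progress", "completed"}
WRITABLE_FIELDS = {"task", "status"}  # id and created are set by the server


def validate_new_item(payload: dict) -> dict:
    """Check a client payload against the to-do rules and return a full item."""
    unknown = set(payload) - WRITABLE_FIELDS
    if unknown:
        # Rejects undeclared properties as well as the read-only id/created
        raise ValueError(f"Unexpected properties: {sorted(unknown)}")
    if not isinstance(payload.get("task"), str):
        raise ValueError("'task' is required and must be a string")
    status = payload.get("status", "pending")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError(f"Invalid status: {status!r}")
    return {
        "id": str(uuid.uuid4()),      # generated by the backend
        "created": int(time.time()),  # UNIX timestamp, also backend-generated
        "task": payload["task"],
        "status": status,
    }
```

Calling validate_new_item({"task": "Write the docs"}) returns a complete item with a fresh ID, while a payload that tries to set id or includes an unknown property raises a ValueError.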

We’ll have two URL paths to manage our list of to-do items:

  • /todo
  • /todo/{item_id}

/todo represents the collection of to-do items, and we’ll use it to fetch a list of our to-do items and to create new to-do items. We’ll use /todo/{item_id} to manage specific tasks, and we’ll use it to retrieve the details of a specific task, to update it, and to delete it.
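The design above boils down to five operations on two paths. A rough sketch of that mapping follows; the ROUTES table and the describe helper are hypothetical names used only for this illustration, and PUT as the update method is taken from the specification we write next.

```python
import re

# (HTTP method, path template) -> what the operation does
ROUTES = {
    ("GET", "/todo"): "list all to-do items",
    ("POST", "/todo"): "create a to-do item",
    ("GET", "/todo/{item_id}"): "retrieve one to-do item",
    ("PUT", "/todo/{item_id}"): "update one to-do item",
    ("DELETE", "/todo/{item_id}"): "delete one to-do item",
}


def describe(method: str, path: str) -> str:
    """Return the operation that a concrete request maps to."""
    for (route_method, template), description in ROUTES.items():
        # Turn "/todo/{item_id}" into the regular expression "/todo/[^/]+"
        pattern = re.sub(r"\{[^/}]+\}", "[^/]+", template)
        if method == route_method and re.fullmatch(pattern, path):
            return description
    raise LookupError(f"No operation for {method} {path}")
```

For example, describe("DELETE", "/todo/d222e7a3-6afb-463a-9709-38eb70cc670d") resolves to the delete operation, while an unsupported method such as PATCH raises a LookupError.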

Now that we have gone through the process of designing our API, let’s document it before we jump into the implementation!

I mentioned at the beginning that we’d be implementing a REST API, so we’ll use OpenAPI to document it. Create a file called oas.yaml and write the following content to it:

openapi: 3.0.3
info:
  title: TODO API
  description: API that allows you to manage a to-do list
  version: 1.0.0
paths:
  /todo/:
    get:
      summary: Returns a list of to-do items
      responses:
        '200':
          description: A JSON array of tasks
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/GetTaskSchema'
    post:
      summary: Creates an task
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateTaskSchema'
      responses:
        '201':
          description: A JSON representation of the created task
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetTaskSchema'
  /todo/{item_id}:
    parameters:
      - in: path
        name: item_id
        required: true
        schema:
          type: string
          format: uuid
    get:
      summary: Returns the details of a task
      responses:
        '200':
          description: A JSON representation of a task
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetTaskSchema'
        '404':
          $ref: '#/components/responses/NotFound'
    put:
      summary: Updates an existing task
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateTaskSchema'
      responses:
        '200':
          description: A JSON representation of a task
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetTaskSchema'
        '404':
          $ref: '#/components/responses/NotFound'
    delete:
      summary: Deletes an existing task
      responses:
        '204':
          description: The resource was deleted successfully
        '404':
          $ref: '#/components/responses/NotFound'
components:
  responses:
    NotFound:
      description: The specified resource was not found.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
  schemas:
    Error:
      type: object
      properties:
        code:
          type: number
        message:
          type: string
        status:
          type: string
    CreateTaskSchema:
      type: object
      required:
        - task
      additionalProperties: false
      properties:
        status:
          type: string
          enum:
            - pending
            - progress
            - completed
          default: pending
        task:
          type: string
    GetTaskSchema:
      type: object
      required:
        - created
        - id
        - status
        - task
      additionalProperties: false
      properties:
        id:
          type: string
          format: uuid
        created:
          type: integer
          description: Date in the form of UNIX timestamp
        status:
          type: string
          enum:
            - pending
            - progress
            - completed
        task:
          type: string
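A specification like this one leans heavily on $ref pointers, and a typo in any of them is easy to miss. As a quick sanity check, the sketch below walks a specification that has already been loaded into a Python dictionary and reports local references that don’t resolve. The helper names iter_refs and unresolved_refs are invented for this example, the spec dictionary is a trimmed-down stand-in for the full document, and the check deliberately ignores remote references and JSON Pointer escaping.

```python
def iter_refs(node):
    """Yield every '$ref' string found anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for element in node:
            yield from iter_refs(element)


def unresolved_refs(spec):
    """Return the local references ('#/...') that point at nothing."""
    missing = []
    for ref in iter_refs(spec):
        if not ref.startswith("#/"):
            missing.append(ref)  # only local references are handled here
            continue
        node = spec
        try:
            for part in ref[2:].split("/"):
                node = node[part]
        except (KeyError, TypeError):
            missing.append(ref)
    return missing


# Trimmed-down stand-in for the specification above
spec = {
    "paths": {
        "/todo/{item_id}": {
            "get": {
                "responses": {
                    "200": {"content": {"application/json": {
                        "schema": {"$ref": "#/components/schemas/GetTaskSchema"},
                    }}},
                    "404": {"$ref": "#/components/responses/NotFound"},
                },
            },
        },
    },
    "components": {
        "responses": {"NotFound": {"content": {"application/json": {
            "schema": {"$ref": "#/components/schemas/Error"},
        }}}},
        "schemas": {
            "Error": {"type": "object"},
            "GetTaskSchema": {"type": "object"},
        },
    },
}

print(unresolved_refs(spec))  # prints []
```

In practice you would load oas.yaml with a YAML parser and pass the result to unresolved_refs; a dedicated OpenAPI validator does a far more thorough job, so treat this only as an illustration of what “resolving a reference” means.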

Now that we have the API specification we are in an excellent position to start implementing the API server. If you’re working with another team that has to implement a client application for the API, such as a frontend application, make sure you make the API specification available to all teams in a central location, such as a GitHub repository or URL endpoint. I’ll write another post later on illustrating how you can do that.

To implement the API we’ll use Flask in combination with flask-smorest. Flask-smorest is a REST API framework that uses marshmallow to validate schemas. For illustration purposes we’re going to keep this super simple and the whole app will be in one file, and the list of items will be represented by an in-memory list. In a real app you want to use persistent storage and make sure different components go into different modules. The only dependency you’ll need to get the code working is flask-smorest, so make sure it’s installed. Create a file called app.py and copy the following content to it:

import time
import uuid

from flask import Flask
from flask.views import MethodView
from flask_smorest import Api, Blueprint, abort
from marshmallow import Schema, fields, EXCLUDE, validate

# Note: the decorator calls below follow the flask-smorest releases available
# when this post was written; newer releases expect the status code as the
# first argument of response().

app = Flask(__name__)
app.config['API_TITLE'] = 'TODO API'
app.config['API_VERSION'] = '1.0.0'
app.config['OPENAPI_VERSION'] = '3.0.3'
app.config['OPENAPI_JSON_PATH'] = "api-spec.json"
app.config['OPENAPI_URL_PREFIX'] = "/"
app.config['OPENAPI_SWAGGER_UI_PATH'] = "/docs"
app.config['OPENAPI_SWAGGER_UI_URL'] = "https://cdn.jsdelivr.net/npm/swagger-ui-dist/"

api = Api(app)


class CreateTaskSchema(Schema):
    class Meta:
        # EXCLUDE silently drops unknown fields; marshmallow's RAISE would
        # reject them instead
        unknown = EXCLUDE

    status = fields.String(
        default='pending',
        validate=validate.OneOf(['pending', 'progress', 'completed']),
    )
    task = fields.String()


class GetTaskSchema(CreateTaskSchema):
    # Server-generated, read-only attributes
    created = fields.Integer(required=True)
    id = fields.UUID(required=True)


blueprint = Blueprint(
    'todo', 'todo', url_prefix='/todo',
    description='API that allows you to manage a to-do list',
)

# In-memory storage, for illustration purposes only
TODO_LIST = []


@blueprint.route('/')
class TodoItems(MethodView):

    @blueprint.response(GetTaskSchema(many=True))
    def get(self):
        return TODO_LIST

    @blueprint.arguments(CreateTaskSchema)
    @blueprint.response(GetTaskSchema, code=201)
    def post(self, item):
        # The spec declares created as an integer UNIX timestamp
        item['created'] = int(time.time())
        item['id'] = str(uuid.uuid4())
        TODO_LIST.append(item)
        return item


@blueprint.route('/<item_id>')
class TodoItem(MethodView):

    @blueprint.response(GetTaskSchema)
    def get(self, item_id):
        for item in TODO_LIST:
            if item['id'] == item_id:
                return item
        abort(404, message='Item not found.')

    @blueprint.arguments(CreateTaskSchema)
    @blueprint.response(GetTaskSchema)
    def put(self, update_data, item_id):
        for item in TODO_LIST:
            if item['id'] == item_id:
                item.update(update_data)
                return item
        abort(404, message='Item not found.')

    @blueprint.response(code=204)
    def delete(self, item_id):
        for index, item in enumerate(TODO_LIST):
            if item['id'] == item_id:
                TODO_LIST.pop(index)
                return
        abort(404, message='Item not found.')


api.register_blueprint(blueprint)

You can run the app with the following command:

$ FLASK_APP=app:app flask run

As per the app configuration, you can visit the API documentation auto-generated from the app with a Swagger UI theme under /docs.
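Besides browsing /docs, you can exercise the running API from a small script. The sketch below uses only the standard library; build_create_request and send are hypothetical helper names, and BASE_URL assumes the app is listening on Flask’s default development address.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed: Flask's default dev address


def build_create_request(task, status="pending"):
    """Build (but do not send) a POST /todo/ request for a new to-do item."""
    body = json.dumps({"task": task, "status": status}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/todo/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request to the running app and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the server running, send(build_create_request("Write the docs")) should return the created item, including the id and created attributes set by the backend.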

Great, now, how do we test this implementation to make sure it’s compliant with the specification? You can use a bunch of different tools and frameworks to accomplish this. Here I’ll show you how you can do it with Dredd. Dredd is an npm package, so to install it run the following command:

$ npm install dredd

To run Dredd, execute the following command:

$ ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:5000 --server "flask run"

If you run this now, you’ll get an error saying that we need to provide an example for the item_id parameter in the /todo/{item_id} URL path. You’ll also see some warnings about issues with the specification format, which you can safely ignore as the API spec is valid (you can validate that with external tools such as https://editor.swagger.io/). Let’s go ahead and add an example for item_id:

  /todo/{item_id}:
    parameters:
      - in: path
        name: item_id
        required: true
        schema:
          type: string
          format: uuid
        example: d222e7a3-6afb-463a-9709-38eb70cc670d
    ...

If you run Dredd again now, you’ll get 5 tests passing and 3 failures. If you look closer at the failures, you’ll see that all of them are related to operations on existing resources under the /todo/{item_id} URL path. It seems Dredd is picking up the example ID we provided in the spec and expecting a resource with that ID to exist. Obviously no resource exists until we start running the API, so we want Dredd to actually create a resource first using POST /todo/, fetch the ID of the created resource, and use it to test the endpoints under the /todo/{item_id} URL path. How do we do that? Using Dredd hooks!

Dredd hooks offer a simple interface that allows us to take action before and after a transaction. Every transaction is identified by a “name”, which is a combination of different parameters that uniquely identify an operation. To list all names available in your specification, run the following command:

$ ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:5000 --server "flask run" --names

The available names are each of the info blocks listed by the command.
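Judging by the names used in the hooks file in this post, each name appears to join the URL path, the operation summary, the response status code and, when the response has a body, its content type. The helper below, transaction_name, is a hypothetical function that reproduces that pattern; it is inferred from those names rather than taken from Dredd’s documentation, so always check the output of the --names flag for the real values.

```python
def transaction_name(path, summary, status_code, content_type=None):
    """Compose a name in the 'path > summary > status > content type' pattern."""
    parts = [path, summary, str(status_code)]
    if content_type:
        # Responses without a body (such as 204) carry no content type
        parts.append(content_type)
    return " > ".join(parts)


print(transaction_name("/todo/{item_id}", "Deletes an existing task", 204))
# prints: /todo/{item_id} > Deletes an existing task > 204
```

Note that the summary is copied verbatim from the specification, typos included, which is why the name for the POST operation contains “Creates an task”.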

We want the names of the successful actions under the /todo/{item_id} URL path, namely actions that return a success status code such as 200 and 204 (i.e. not those returning 404). Create a file called hooks.py and copy the following content to it:

import json

import dredd_hooks

# Shared state between hooks: holds the ID of the task created by POST /todo/
response_stash = {}


@dredd_hooks.after('/todo/ > Creates an task > 201 > application/json')
def save_created_task(transaction):
    response_payload = transaction['results']['fields']['body']['values']['actual']
    task_id = json.loads(response_payload)['id']
    response_stash['created_task_id'] = task_id


@dredd_hooks.before('/todo/{item_id} > Returns the details of a task > 200 > application/json')
def before_get_task(transaction):
    transaction['fullPath'] = '/todo/' + response_stash['created_task_id']


@dredd_hooks.before('/todo/{item_id} > Updates an existing task > 200 > application/json')
def before_put_task(transaction):
    transaction['fullPath'] = '/todo/' + response_stash['created_task_id']


@dredd_hooks.before('/todo/{item_id} > Deletes an existing task > 204')
def before_delete_task(transaction):
    transaction['fullPath'] = '/todo/' + response_stash['created_task_id']
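If the flow of data between hooks isn’t obvious, the following sketch simulates it without Dredd. The two dictionaries are hand-written stand-ins for the transaction objects Dredd passes to the hooks, containing only the keys the hooks above touch, and the UUID in the fake response is made up.

```python
import json

response_stash = {}


def save_created_task(transaction):
    """Same logic as the 'after' hook: remember the ID of the created task."""
    payload = transaction["results"]["fields"]["body"]["values"]["actual"]
    response_stash["created_task_id"] = json.loads(payload)["id"]


def use_created_task(transaction):
    """Same logic as the 'before' hooks: point the request at the real task."""
    transaction["fullPath"] = "/todo/" + response_stash["created_task_id"]


# Stand-in for the transaction seen after POST /todo/ (made-up response body)
post_transaction = {
    "results": {"fields": {"body": {"values": {
        "actual": json.dumps({"id": "11111111-2222-4333-8444-555555555555"}),
    }}}},
}
# Stand-in for the transaction seen before GET /todo/{item_id}
get_transaction = {"fullPath": "/todo/d222e7a3-6afb-463a-9709-38eb70cc670d"}

save_created_task(post_transaction)
use_created_task(get_transaction)
print(get_transaction["fullPath"])
# prints: /todo/11111111-2222-4333-8444-555555555555
```

The example ID from the specification is replaced by the ID of a resource that actually exists, which is exactly what the failing tests were missing. This only works if the POST transaction runs before the others.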

I’ll talk more about dredd and dredd hooks in another tutorial. You can run Dredd with these hooks with the following command:

$ ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:5000 --hookfiles=./hooks.py --language=python --server "flask run"

Now everything passes:

Right, so now we are certain that our API implementation complies with the specification. If another team working on a frontend application has made similar tests, we can be quite certain that both the server and the frontend app will integrate nicely and without errors. Or at least we won’t encounter integration errors due to non-compliance with the API.

Let’s now say that we want to make a change to the API. As it turns out, we’d like to be able to assign a priority to each task. This will result in a new field in the task resource payload called “priority”, which may have one of the following values:

  • low
  • medium
  • high

The default value will be “low”.
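In code terms, the rule for the new field could look like the sketch below. The function with_priority and the two constants are hypothetical names chosen for this illustration; the real validation will live in the marshmallow schemas.

```python
ALLOWED_PRIORITIES = ("low", "medium", "high")
DEFAULT_PRIORITY = "low"


def with_priority(payload):
    """Return a copy of the payload with a validated priority filled in."""
    priority = payload.get("priority", DEFAULT_PRIORITY)
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"Invalid priority: {priority!r}")
    # Build a new dict so the caller's payload is left untouched
    return {**payload, "priority": priority}


print(with_priority({"task": "Write the docs"}))
# prints: {'task': 'Write the docs', 'priority': 'low'}
```

Because the field has a default, clients that don’t know about priority yet can keep sending the same payloads as before.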

How should we approach this change? We are practicing documentation-driven development, so first we’ll change the specification. Once we’ve updated the specification, we can change both the backend and the frontend. Any change to the backend API that doesn’t comply with the spec will result in a failed test and shouldn’t be released.

The updated schemas in oas.yaml look like this:

...
    CreateTaskSchema:
      type: object
      required:
        - task
      additionalProperties: false
      properties:
        priority:
          type: string
          enum:
            - low
            - medium
            - high
          default: low
        status:
          type: string
          enum:
            - pending
            - progress
            - completed
          default: pending
        task:
          type: string
    GetTaskSchema:
      type: object
      required:
        - created
        - id
        - priority
        - status
        - task
      additionalProperties: false
      properties:
        id:
          type: string
          format: uuid
        created:
          type: integer
          description: Date in the form of UNIX timestamp
        priority:
          type: string
          enum:
            - low
            - medium
            - high
          default: low
        status:
          type: string
          enum:
            - pending
            - progress
            - completed
        task:
          type: string

If we now run the Dredd command the test will fail:

$ ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:5000 --hookfiles=./hooks.py --language=python --server "flask run"
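A likely reason for the failure is that the responses produced by the current implementation no longer contain every property the updated GetTaskSchema marks as required. The sketch below shows that check in isolation; missing_fields is a hypothetical helper, and old_response is a made-up example of what the server returns before we update it.

```python
# Required properties of GetTaskSchema after the specification change
REQUIRED_FIELDS = ["created", "id", "priority", "status", "task"]


def missing_fields(response_body, required=REQUIRED_FIELDS):
    """List the required properties that are absent from a response payload."""
    return [name for name in required if name not in response_body]


# Made-up payload in the shape the not-yet-updated server returns
old_response = {
    "id": "d222e7a3-6afb-463a-9709-38eb70cc670d",
    "created": 1603065600,
    "status": "pending",
    "task": "Write the docs",
}

print(missing_fields(old_response))  # prints ['priority']
```

Dredd performs a much more complete validation against the schema, but the idea is the same: the specification changed first, so the implementation is now the part that is out of date.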

To fix this we only need to update our marshmallow schemas:

class CreateTaskSchema(Schema):
    class Meta:
        unknown = EXCLUDE

    priority = fields.String(
        default='low',
        validate=validate.OneOf(['low', 'medium', 'high']),
    )
    status = fields.String(
        default='pending',
        validate=validate.OneOf(['pending', 'progress', 'completed']),
    )
    task = fields.String()


class GetTaskSchema(CreateTaskSchema):
    created = fields.Integer(required=True)
    id = fields.UUID(required=True)

If you run the Dredd command again, the tests now pass. In this example, updating the marshmallow schemas is sufficient to update the app. In a real application, you’ll be persisting data to a database and you’ll need to run some migrations to update your models as well.
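To give an idea of what such a migration could involve, here is a minimal sketch using SQLite from the standard library. The todo table and its columns are assumptions made for this example (the app in this post has no database), and in a real project you would normally run the change through a migration tool rather than by hand.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Hypothetical table as it might look before the API change
connection.execute(
    "CREATE TABLE todo ("
    "id TEXT PRIMARY KEY, "
    "created INTEGER NOT NULL, "
    "status TEXT NOT NULL DEFAULT 'pending', "
    "task TEXT NOT NULL)"
)
connection.execute(
    "INSERT INTO todo (id, created, task) VALUES (?, ?, ?)",
    ("d222e7a3-6afb-463a-9709-38eb70cc670d", 1603065600, "Write the docs"),
)

# The migration: add the new column with the default from the specification
connection.execute(
    "ALTER TABLE todo ADD COLUMN priority TEXT NOT NULL DEFAULT 'low'"
)

row = connection.execute("SELECT task, status, priority FROM todo").fetchone()
print(row)  # prints ('Write the docs', 'pending', 'low')
```

Existing rows pick up the default value, so tasks created before the change are still valid against the updated GetTaskSchema.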

If you liked this post, don’t forget to clap!

And if you want to learn more about API integrations, have a look at my new book “Microservice APIs in Python”. It’s just been released through the Manning Early Access Program (MEAP), which means it’s still under development and you get a chance to give your feedback and request new content before the book is in print. Access to the book: http://mng.bz/nz48

If you’re interested in the book, you can use the code slperalta to get a 40% discount. You can also download two chapters for free from the following URL: https://www.microapis.io/resources/microservice-apis-in-python.

I look forward to your comments!
