MVP OpenAPI spec for CKAN API — lessons learned

CKAN (open data portal) API documentation is good and extensive. As a first touch and “landing page” for new consumers, though, it is too extensive.

The customer wanted us to lower the learning curve: define an OpenAPI spec for the API and put it in our API management platform (which they are using). What the heck! Sounds like a good job. Little did we know.

This is mostly a story of failures, swearing, sweat and even tears. But most of all, it’s a story of a bunch of apitalists not remembering one thing: we are not dealing with a REST API. We were using REST API tools to define a non-REST API.

Forgotten truth: we are not dealing with a REST API.

So we started writing an MVP OpenAPI spec (swagger) for the CKAN action API. We looked for an existing machine-readable description without any luck. We found some specs that at first looked like ready-made solutions, but a closer look revealed that the swagger name was deceptive. So we had to do it all from scratch.

What is CKAN?

Before going deeper into the story of “swaggering” one API, let’s have a look at what CKAN is. CKAN is the world’s leading open source data portal platform. It is a powerful data management system that makes data accessible by providing tools to streamline publishing, sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies and organizations) wanting to make their data open and available. CKAN is open source, free software. This means that you can use it without any license fees, and you retain all rights to the data and metadata you enter. Being an open source project, CKAN and its extensions are developed by a large community of people.

What to include?

Just MVP. We discussed the functionality that API consumers would need to get basic operations done. Based on that we added stubs to the OpenAPI spec for the functions we saw fit: add, update, remove and list datasets, and the same for organizations. This was, after all, going to be only an MVP spec to satisfy specific needs.
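To make the scope concrete, here is a minimal sketch of such a stub, written as a Python dict rather than YAML. The action names are the standard CKAN action API ones; the title, version, base path and the Swagger 2.0 layout are assumptions for illustration and not copied from the published spec.

```python
# Hypothetical MVP scope: CKAN action name -> what the consumer wants to do.
MVP_ACTIONS = {
    "package_list": "List datasets",
    "package_create": "Add a dataset",
    "package_update": "Update a dataset",
    "package_delete": "Remove a dataset",
    "organization_list": "List organizations",
    "organization_create": "Add an organization",
    "organization_update": "Update an organization",
    "organization_delete": "Remove an organization",
}


def build_stub_spec():
    """Return a spec skeleton with one empty path item per MVP action."""
    # Each path item starts empty; operations and parameters come later.
    paths = {f"/{name}": {} for name in MVP_ACTIONS}
    return {
        "swagger": "2.0",
        "info": {"title": "CKAN action API (MVP)", "version": "0.1.0"},
        "basePath": "/api/3/action",  # assumed mount point of the action API
        "paths": paths,
    }
```

Even at this stage the paths are action names, not resources, which is the first hint of what comes later in the story.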

GET was easy

We started to define the OpenAPI spec for the generic action API of the CKAN platform with the GET functionality. What we did was read about each method in the CKAN API documentation and “copy” the parameters (not all, just the absolutely necessary ones for our case) to the OpenAPI spec. And yes, this was the first time we noticed that it’s not a REST API. Did we remember that when we went forward? No. It did always feel somehow wrong, but we kept on going. The paths looked unREST (/package_list etc).
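As a rough illustration of that copying step, the sketch below describes package_list with only its limit and offset parameters and builds the matching GET URL. The dict layout follows Swagger 2.0; the helper name action_url and the example host are made up for this post.

```python
from urllib.parse import urlencode

# Only the parameters we needed, copied by hand from the CKAN docs.
PACKAGE_LIST_OPERATION = {
    "get": {
        "summary": "List dataset names",
        "parameters": [
            {"name": "limit", "in": "query", "type": "integer", "required": False},
            {"name": "offset", "in": "query", "type": "integer", "required": False},
        ],
        "responses": {"200": {"description": "OK"}},
    }
}


def action_url(base_url, action, **params):
    """Build the URL of a CKAN action call; params become the query string."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url
```

For example, action_url("https://ckan.example.org", "package_list", limit=5) points at the action name itself, with no resource identifier anywhere in the path.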

Add authorization

To get the list of users we had to add authorization to the specification. That was easy. Just define some security elements in the spec and name one key Authorization. CKAN does not use authentication. Security is based on just passing an authorization key in the header. Not the best possible solution in my opinion. I would have done an OAuth-based solution. But hey, that’s not my problem now. My task is to get this OpenAPI spec done and in use.

This was needed to list users in CKAN. Testing the API call from curl worked after a few tries. It did not work in SwaggerUI. There was something weird going on with passing the key to API but one guy started to look at the problem while others went forward. By the way, did I already mention that we again forgot that CKAN API is not REST API?
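A minimal sketch of what that looks like, assuming a Swagger 2.0 style security definition and a hypothetical host. The request is only built here, not sent, and the definition name ckan_key is invented for the example.

```python
import urllib.request

# Security element as it might appear in the spec (shown as a Python dict).
SECURITY_DEFINITIONS = {
    "ckan_key": {"type": "apiKey", "name": "Authorization", "in": "header"}
}


def user_list_request(base_url, api_key):
    """Build (but do not send) a user_list call carrying the CKAN key."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/3/action/user_list",
        headers={"Authorization": api_key},
        method="GET",
    )
```

Passing the result to urllib.request.urlopen would perform the call, which is roughly what the curl test did.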

Let’s add POST to create dataset

Hmm.. ok, what next? Let’s add POST functionality so that we can add datasets to CKAN. This is the moment when everything started to go downhill. It was not just authorization: the browser preflight requests (OPTIONS) and the proxy in use did not seem to work well together. It took some time (days) to figure out why the API key (required by the APInf platform) and the authorization key (required by the CKAN API) were not passed back and forth as they should be. We had to configure the proxy to handle the headers correctly. Details about this can be found in another post written by my teammate. That was solved and keys started to flow. Yay! Now we are getting forward!
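For readers who have not met preflight requests: before a cross-origin POST with custom headers, the browser sends an OPTIONS request and only continues if the answer allows those headers. The sketch below is a generic illustration of that rule, not our actual proxy configuration; the header name X-API-Key and the function name are assumptions.

```python
def preflight_response_headers(
    origin,
    requested_headers,
    allowed_headers=("Authorization", "X-API-Key", "Content-Type"),
):
    """Answer a CORS preflight for the given Access-Control-Request-Headers.

    Raises ValueError if the browser asks for a header that is not allowed,
    which is roughly the situation in which the keys never arrive.
    """
    allowed = {name.lower() for name in allowed_headers}
    requested = [h.strip() for h in requested_headers.split(",") if h.strip()]
    rejected = [h for h in requested if h.lower() not in allowed]
    if rejected:
        raise ValueError(f"headers not allowed: {rejected}")
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": ", ".join(requested),
    }
```

If either key header is missing from the allowed list, the browser drops the real request, while curl (which never sends a preflight) keeps working.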

Let’s try the delete! That should be easy now that authorization is working. Oh, by the way CKAN API is not REST. We did not seem to be able to keep that in mind.

DELETE is not working — why?

Next task was to add DELETE to remove dataset from CKAN. After adding the description to OpenAPI spec we uploaded the new version to APInf and started testing. Aargh! Not working! What now! Of course we tried the operation from curl as well. No luck. We dropped the proxy from between and tried to operate directly with the API. No luck.

Then the CKAN instance maintainer provided example that should work:

I said hold on! This is using POST, not DELETE! Then the same CKAN platform maintainer said the magic sentence in Flowdock: “It’s not REST API. It does not know anything about DELETE, PUT or PATCH. It only operates with GET and POST”. Ok, it’s not a REST API.

I modified the curl command to use the proxy and both keys (including the platform API key) and changed the method to POST.

Did it work? Yes it did. Dataset was removed.
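The same call can be sketched in Python. This is not the maintainer’s example: the proxy URL, the X-API-Key header name and the function name are assumptions, while package_delete with an id in the JSON body is the standard CKAN action. The request is built but not sent.

```python
import json
import urllib.request


def delete_dataset_request(action_base_url, dataset_id, ckan_key, platform_key):
    """Build a POST to package_delete carrying both keys."""
    body = json.dumps({"id": dataset_id}).encode("utf-8")
    return urllib.request.Request(
        f"{action_base_url.rstrip('/')}/package_delete",
        data=body,
        headers={
            "Authorization": ckan_key,   # key required by the CKAN API
            "X-API-Key": platform_key,   # key required by the platform (assumed name)
            "Content-Type": "application/json",
        },
        method="POST",  # CKAN knows nothing about DELETE
    )
```

Note that the removal is expressed entirely by the action name in the path; the HTTP method carries no meaning beyond “this call changes something”.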

Should we continue?

At this point we handed the decision about next actions to the customer. Should we continue or not? After a long-ish discussion we came to the conclusion that yes, we will continue. This is what we were planning to have initially:

In the discussions it also became clear that the previous CKAN API (v 2) was REST, but for some reason they decided to ditch the REST approach in development.

We accepted the fact that the color coding SwaggerUI gives to different HTTP methods is lost and therefore some consumers might be a bit confused at first (a major hit for DX). To avoid misunderstanding as much as possible, we decided to put a clear note at the beginning of the OpenAPI spec saying that DELETE and PUT methods are replaced with POST due to the implementation of the CKAN API.

Even with such limitations, the OpenAPI spec gives API consumers a lower learning curve for onboarding than diving directly into the CKAN API documentation. In other words, this OpenAPI spec served via SwaggerUI is the “getting started” with the CKAN API.

Work is done for MVP

So we did convert all DELETE and PUT methods to POST. We froze the fine-tuning of the OpenAPI spec for a while and published the YAML version on GitHub. The reason for the freeze is that the data model discussion behind it takes time to reach its goals and to freeze the data models used.

On top of that we added some notes at the beginning of the OpenAPI spec file about this unorthodox use of the specification. If you are considering reusing the OpenAPI spec file, keep in mind that the data models are quite often installation specific.
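We did the conversion by hand, but the idea is simple enough to sketch as code. The function below is a hypothetical helper, not part of the published spec: it moves every DELETE and PUT operation under POST and prepends a note to the spec description. The note text and the spec layout are assumptions.

```python
import copy

NOTE = (
    "Note: the CKAN action API only accepts GET and POST, so operations "
    "that would normally be DELETE or PUT are described as POST."
)


def use_post_only(spec):
    """Return a copy of spec with DELETE/PUT operations moved to POST."""
    result = copy.deepcopy(spec)  # leave the caller's spec untouched
    for path, item in result.get("paths", {}).items():
        for verb in ("delete", "put"):
            if verb in item:
                if "post" in item:
                    # Two operations cannot share POST on the same path.
                    raise ValueError(f"{path} already defines POST")
                item["post"] = item.pop(verb)
    info = result.setdefault("info", {})
    info["description"] = (NOTE + "\n\n" + info.get("description", "")).strip()
    return result
```

Because every CKAN action has its own path, the collision check should never fire in practice, but it makes the assumption explicit.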

We also added a few examples of API calls to get started and information about authentication and API keys.

Comments to CKAN API OpenAPI spec

All comments are welcome, and you should use GitHub for this. The CKAN API OpenAPI spec file in YAML format is there. Use the issues feature to raise discussion.