What does a 15-minute APIOps Cycles Audit tell you about your API?
This is a tutorial on how to use the API Audit in APIOps Cycles. I was requested to do this by an API Product Manager at a telco company, as they are just creating their own API development guidelines using APIOps Cycles as a basis.
This video (and this blog, which is the contents of the video minus the pictures) has content I typically use with my consulting customers and also with our APIOps Cycles subscribers. If you need information, help or inspiration with your API-related efforts, be sure to subscribe to a plan at apiopscycles.com and tell a friend, too.
I’m going to use the API audit with a real API. It’s just something random I picked from the internet related to food and nutrition with a machine learning algorithm. This is an API I haven’t used before, to keep things more interesting.
I picked this EDAMAM API because for the first 5 seconds, it looked like quite a well-designed and managed API. The story here is that I’ll use the API audit like a buyer would inspect the goods they are about to buy. This is one use case of the API audit. The other is to use it when building or changing an API as an API provider.
If you’d like us to do something similar with your API, don’t hesitate to chat with us on our website.
Let’s look at the API audit page and instructions on the APIOps Cycles site. As you can see, it comes right after the “Build” phase, meaning that usually the API has been built before the audit, but not yet published. In this case, we have a published API on our hands because we are buyers.
The purpose of the API audit is to make sure the API is publishable, manageable, supportable, safe, usable and well-designed, so it’s easily learnable. We can say that overall if an API passes API audit, it has good developer experience as well as good architecture.
Going through the checklist
“API is published via API management” is true. It wasn’t directly visible, but I noticed it by checking a few items in the website code and in the documentation, where some of the parameters were documented as “3scale id” etc.
“API is visible in a Developer portal” is true.
“API can only be accessed via API management gateway.” We won’t know this unless we try the API and it leaks some internal addresses where we can try our luck. This is a surprisingly common mistake, even with experienced developers, when using API management.
“Rate limits are enforced when requesting API.” We won’t know for sure, but as they are using API management and they are listing the rate limits as part of the subscription plans, I’m sure that’s covered.
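Rate limits enforced at the gateway are often advertised to consumers through response headers. A minimal sketch of how a consumer could inspect them; the `X-RateLimit-*` header names below are a common convention, not something this API documents, so treat them as an assumption:

```python
# Sketch: reading conventional rate-limit headers from a response header
# mapping. The X-RateLimit-* names are a common convention and may differ
# per API management platform.

def rate_limit_info(headers):
    """Extract rate-limit details from a response header mapping, if present."""
    return {
        "limit": headers.get("X-RateLimit-Limit"),
        "remaining": headers.get("X-RateLimit-Remaining"),
        "reset": headers.get("X-RateLimit-Reset"),
    }

# Hypothetical response headers for illustration:
headers = {"X-RateLimit-Limit": "10", "X-RateLimit-Remaining": "7"}
info = rate_limit_info(headers)
print(info["remaining"])  # prints 7
```

A consumer that reads these headers can back off before the gateway starts rejecting requests with 429 responses.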
“The specification is maintained automatically when changes are done to API.” That we won’t know for sure. The documentation doesn’t look like it’s generated from an OpenAPI description, which could have been published automatically as part of continuous integration. It has more manually written text and examples. From the source code of the page, I can see that the examples are, in fact, GitHub gists.
“Specification for endpoints is validated on every change against the standard specification.” This means that if OpenAPI, RAML or some other standard is used, the interface description of this API is validated to be according to that standard. The problem is that if it isn’t valid, there is a very good chance that publishing the API via API management will fail. It might fail even then, because API management platforms are sometimes installed with a particular version that supports only a particular version of that standard. So you always need to know which tools and which versions you are dealing with.
“The specification contains schema for the requests and responses.” Doesn’t look like it; there is no JSON Schema to be found in the documentation. This means that API consumers must do their own validation, and also that the API provider needs to pay extra attention when changing the implementation so that they don’t break the interface. Automated testing is so much simpler when using a schema, at least when it’s used correctly and the endpoints are designed smartly.
If we look closely at the examples, we’ll see something interesting. All the concepts have an ontology reference with OWL. But that would only help in “understanding” the meaning or relationship of, for example, the measurements used, and in validating that a value is valid and found in the ontology, not in validating the full dataset.
“Request and response schema and examples are validated for format and examples pass the schema validation.” Again, this is false, as we don’t have the schema here. Really often, when looking at our customers’ or 3rd-party APIs, we find that the examples are wrong format-wise, and sometimes they also include incorrect data. This makes it very hard for API consumers to start using the API.
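When the provider publishes no schema, the consumer has to validate responses themselves. A minimal stdlib sketch of what that looks like; the field names (`calories`, `totalWeight`) are illustrative assumptions, not this API’s documented response shape:

```python
# Minimal consumer-side validation when the provider publishes no schema.
# The field names and types here are illustrative, not the API's actual
# documented response shape.

REQUIRED = {"calories": (int, float), "totalWeight": (int, float)}

def validate_response(payload):
    """Return a list of problems found in a decoded JSON response."""
    problems = []
    for field, types in REQUIRED.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], types):
            problems.append(f"wrong type for {field}: {type(payload[field]).__name__}")
    return problems

print(validate_response({"calories": 120.5}))  # ['missing field: totalWeight']
```

With a published JSON Schema, both sides could run this kind of check automatically on every change instead of each consumer hand-rolling it.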
“The API uses HTTPS (or in special cases other stateless protocol with encryption)” is true and important for security. Even websites should use HTTPS nowadays.
“The API is published under the organization’s official domain” is true, this is important for branding, looking reliable and remembering the address of the endpoints for humans, but also important when handling SSL certificates, firewalls and other network security aspects.
“The visible domain is shared with other APIs (i.e. the domain the API consumers see)”. This is true. All three APIs use the same domain; only the final path of the API is different. This is again a developer experience thing as well as a manageability one. Sometimes APIs that are implemented on microservices have sub-domains that are generated automatically, and this causes headaches for network management as well as for developers if it’s visible even after the API is published via the API management.
“The endpoints are maximum 2-resources deep (Example /projects/123/tasks/345)”. In this case the endpoints are not modelled as individually identified resources but are used more for complex searching with multiple identifiers and keywords. They could also have done an endpoint for each identifier, but have instead chosen to expose the parser of their search engine as the one endpoint. This is most likely a good decision for their intended value propositions, as shown in the Showcases tab, for example food diaries.
“Other naming styles in style guide have been applied.” We don’t know if they have their own style guide for designing APIs, but there are some irregularities in the API design here in terms of the APIOps Cycles default design guide. The resource names are one thing, but otherwise nothing major, just that some of the attribute names have been shortened. This is sometimes a chosen strategy to keep the URL length shorter, at least in the endpoints that use GET requests with search criteria as query parameters. While browsers can nowadays handle extremely long URLs, there is bound to be some layer somewhere in code or network that doesn’t like super long URLs.
With the Full Recipe Analysis endpoint, the values of the attributes can be quite long text, too. It’s very common to design this type of API with POST requests, especially when, as here, the result is a nutritional analysis of the submitted recipe based on natural language and ingredients, and the request and results are cached. The Food Text Analysis endpoint, on the other hand, is a good example of using a GET request when there are fewer parameters.
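The trade-off between the two styles can be sketched without sending anything over the network. The URL, paths and parameter names below are made up for illustration, not the API’s real endpoints:

```python
# Building (not sending) the two request styles: a GET carrying search
# criteria as query parameters versus a POST carrying the same criteria in
# the body. The host, paths and parameter names are hypothetical.
import json
import urllib.parse
import urllib.request

criteria = {"ingr": "1 cup rice, 10 oz chickpeas", "nutrition-type": "cooking"}

# GET: criteria end up in the URL, which can grow very long with free text.
get_url = "https://api.example.com/nutrition-data?" + urllib.parse.urlencode(criteria)
get_req = urllib.request.Request(get_url)  # GET by default, no body

# POST: criteria travel in the request body, keeping the URL short.
post_req = urllib.request.Request(
    "https://api.example.com/nutrition-details",
    data=json.dumps(criteria).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(get_req.get_method(), len(get_req.full_url))
print(post_req.get_method(), len(post_req.full_url))
```

With real recipe text the GET URL quickly reaches lengths that some proxy or logging layer will dislike, which is exactly why the body-carrying POST style is the common choice for this kind of analysis endpoint.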
“API has versioning”. Not visibly. They are using an API management solution, so the versioning is possibly implemented inside the API management solution and controlled by the subscription mechanism.
This goes to the next point: “Versioning strategy is best for the selected API management platform and for the primary API consumers?” Yes, probably true in this case.
“API uses stateless processing (no sessions, OpenID Connect tokens are ok)”. Looks like it, based on the documentation. What you request is what you will get, each time, and the API backend won’t remember what you did before. This is good: less room for inconsistent behaviour and data.
“There is no special processing (asynchronous events)”. True.
“GET -requests don’t have request bodies”. True, this is according to the standard. Not all API clients support GETs with bodies, as it’s not a standard implementation according to the RFC.
“POST is used for creating and updating data?” Not applicable for this API
“POST is used only in standard ways” In a way relevant: as mentioned, they do use POST for the very complex and variable-rich queries. This is non-standard but very commonly practised and a good choice if you really must do a complex search. But you should always think carefully about the real need to do that, because it may result in complex logic, access-rights and performance issues.
“PUT is used to create or replace the entire resource?” Not applicable
“DELETE is used only to remove a resource?” Not applicable
“404 is used for the wrong URL.” Yes, according to the documentation
“400 -responses have additional information of the specific error (for example missing required attribute).” Interestingly, they haven’t documented the 400 status code at all. So possibly they don’t validate the query parameters. I’m assuming the result is just 404, or 200 with an empty result, if the parameters are wrong. Here we have a potential problem, because neither the 400 nor the 204 status code is used.
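What a helpful 400 response body could look like on the provider side, a minimal sketch; the error-code strings and the `ingr` parameter name are invented for illustration:

```python
# Sketch of a 400 response body that tells the consumer exactly what to
# fix, instead of a bare 404 or an empty 200. The error code values and
# parameter names are hypothetical.
import json

def bad_request_body(missing_params):
    """Build a JSON body for a 400 response listing the missing parameters."""
    return json.dumps({
        "status": 400,
        "error": "missing_required_parameter",
        "detail": f"Missing required query parameter(s): {', '.join(missing_params)}",
    })

print(bad_request_body(["ingr"]))
```

A consumer seeing this body knows immediately the request itself was at fault and which parameter to add, instead of guessing whether an empty 200 or a 404 meant “no data” or “bad request”.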
Now, they do use 422 for unprocessable entities. This is interesting, and most likely chosen because they are handling natural text and trying to grasp its meaning. Sometimes the text may include, for example, characters that cannot be parsed. Let’s get back to this part later when we are dealing with more OWASP security issues.
“401 -response is used when API consumer is using the wrong credentials.” According to the documentation it isn’t used, but as this API and its credentials are managed by an API management solution, I’m betting that the response is actually 401. Let’s try. Yep, I was right. This is included in the OWASP security checklist criteria, and it’s important that your API backend responds with 401 to the gateway, too, if the credentials are not correct. It’s also a usability issue: the consumer needs to understand what went wrong. If the API is authenticated with a token-based system and uses refresh tokens validated at the gateway, the 401 code also tells the API management to try to refresh the token.
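From the consumer’s side, the distinct status codes translate directly into distinct recovery actions. A small client-side sketch; the control flow is illustrative, and how a token is actually refreshed depends on the auth scheme:

```python
# Client-side sketch: mapping authentication-related HTTP status codes to
# the action an API client should take. The action names are illustrative.

def next_action(status):
    """Decide what an API client should do for a given HTTP status code."""
    if status == 401:
        return "refresh-token-and-retry"   # credentials rejected or token expired
    if status == 403:
        return "report-forbidden"          # authenticated, but not allowed
    if status >= 500:
        return "retry-later"               # server-side problem, not the client's fault
    return "proceed"

print(next_action(401))  # refresh-token-and-retry
```

This is exactly why conflating 401, 403 and 500 hurts usability: the client can only pick the right recovery step if the server picks the right code.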
Another OWASP criterion: “403 is used when requesting an endpoint which is valid but not accessible by the requesting API consumer, or when trying to use an operation they are not allowed to do.” It’s important to make sure that any endpoints the authenticated user is not allowed to use are secure and respond with the correct status code. According to the documentation of this API, this status is not used. As they have grouped endpoints belonging to the same subscription product into a separate API with its own subscription, this is probably not relevant.
This audit item is an example of how the Collecting Requirements phase of the Minimum Viable API Architecture uses the Business Impact template, which maps directly to the API implementation and to this audit checklist. So be sure to check out that part of the APIOps Cycles method.
“500 -response when there is an internal processing problem which API consumer can not fix by changing the request.” This status code is used almost by default in most server and gateway implementations. It’s not documented in their API documentation, but most likely would happen. The problem usually is that APIs respond with 500 to all errors, or at least to too many of them, even those that are the API consumer’s fault, where 400 would be more appropriate.
“500 -responses have application-specific error code but not a very clear plain message about the exact error (stack trace or error text) which could expose internal implementation to API consumer.” I won’t try to break this API right now, but hopefully this is ok. This is part of the OWASP requirements. If the error message exposes a stack trace or too many details about what the API is implemented on and what exactly went wrong, it could provide a hacker with details on how to improve their attack and use known vulnerabilities.
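A provider-side sketch of this principle: the full stack trace is logged internally, while the consumer receives only a safe, application-specific error body. The incident-id idea is an assumption, a common pattern rather than anything this API documents:

```python
# Sketch: turning an unexpected server-side exception into a safe 500
# payload. The stack trace is logged internally but never sent to the API
# consumer. The incident-id correlation pattern is illustrative.
import json
import logging
import traceback
import uuid

logging.basicConfig(level=logging.ERROR)

def internal_error_body(exc):
    """Log the full exception server-side; return a sanitized JSON body."""
    incident_id = str(uuid.uuid4())
    # Full details stay in the server logs only.
    logging.error("incident %s (%r):\n%s", incident_id, exc, traceback.format_exc())
    return json.dumps({
        "status": 500,
        "error": "internal_error",
        "incident": incident_id,  # lets support correlate without leaking internals
    })

try:
    1 / 0  # stand-in for an unexpected implementation failure
except ZeroDivisionError as exc:
    body = internal_error_body(exc)

print("Traceback" in body)  # False: no stack trace reaches the consumer
```

The consumer still gets something actionable (a stable error code and an incident id to quote to support), which is the balance this checklist item asks for.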
“The GET-request: 200 OK and items -array as an empty array.” This is one of the success status code items; there are more listed here. The API in question offers only 200 OK for anything that is not clearly a mistake.
“The GET-request: 204 empty response, nothing in the body.” So no 204 for empty responses. What if the ingredients submitted don’t have any nutritional data associated with them? I don’t know if there is any situation where the response would be totally empty. If that happened, I wouldn’t know whether it was my mistake or totally ok.
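The value of the 200-with-empty-array and 204 conventions is that the consumer can tell “valid request, no results” apart from an error. A small consumer-side sketch; the `items` key is the checklist’s convention, not this API’s actual response shape:

```python
# Consumer-side sketch: distinguishing "valid request, no results" from an
# error. The items-array shape follows the checklist convention and is
# illustrative of what a response could look like.
import json

def interpret(status, body):
    """Classify a response as ok, ok-but-empty, or error."""
    if status == 204 or (status == 200 and not json.loads(body or "{}").get("items")):
        return "ok-but-empty"
    if status == 200:
        return "ok"
    return "error"

print(interpret(200, '{"items": []}'))  # ok-but-empty
print(interpret(204, ""))               # ok-but-empty
```

Without these conventions, an empty 200 body leaves the consumer guessing, which is exactly the ambiguity described above.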
“POST: 200 OK for updates or submits without creating new resources.” This is ok.
“201 -response is combined with the identifier of the created resource.” Not applicable.
“DELETE 204 OK when removing resource was successful.” Not applicable
And what was the result?
This API is clearly quite well designed, documented and managed, although there is also room for improvement in reducing the chances of breaking the API while it’s undergoing further development. Also, fixing a few issues would make new developers’ journey a bit easier. Overall, a very good achievement so far.
But as you can see there are a few more interesting items left on the checklist.
These are the ones where you kind of need to have the experience to truly value someone’s experience and tips. Both areas will be covered in detail in the next APIOps Cycles Community, Pro and Partner newsletters, available for subscribers at the end of July 2019. And you’ll get some other goodies, too, so be sure to subscribe. And of course, subscribe to our YouTube channel, too.