Building an API Token system from scratch

With the recent trend toward microservices and service-based architecture, APIs have become a prime target for attacks of different kinds. OWASP has even created a list of top security threats to APIs, and it is essential to build a best-in-class API security system that is constantly updated to protect itself from such threats.

Token-based authentication is one of the many steps that API security systems use to safeguard their resource APIs, and this article describes some of the fundamental points to consider while building your own API token system.

So what is an API?

There is an excellent article which explains what an API is and its responsibilities. An API is simply an interface that is set up to communicate with applications instead of end users. This definition itself poses an interesting problem: how to distinguish good applications from bad ones and, more importantly, how to prevent bad applications from accessing your sensitive resources (APIs).

What is OAuth 2.0?

Since the days of SOAP and XML-based services, multiple authentication paradigms and specifications have helped define and secure APIs. One such standard is OAuth 2.0, which is a framework, whereas its older counterpart OAuth 1.0 is a specification. Because it was a strict specification, and because of the way it handled secure artifacts and data exchange, OAuth 1.0 was cumbersome to implement on mobile devices and other API consumers.

OAuth 2.0 provides a more robust framework within which identity providers, resource owners and applications can communicate with each other without the resource owner having to share credentials with the application, which is one of the biggest anti-patterns observed in API-application interaction.

API Security Pipeline

A typical API security pipeline has multiple filters and levels of security built on top of each other. Some of the commonly found pipeline elements are authorization, rate limiting and resource caching. However, all of these measures depend on “authentication” being performed accurately and with non-repudiation.

A Simple API Security Pipeline

For example, in Spring Security, API endpoint authorization can be configured using the hasPermission method, which depends on the authentication pipeline to identify the correct user and inject the associated roles.

So it is extremely important to design a token system architecture which can support extensibility and facilitate other API security needs.
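To make the dependency on authentication concrete, here is a minimal sketch of such a pipeline in Python. Everything in it is an assumption made for illustration: the `Request` class, the `KNOWN_TOKENS` and `REQUIRED_ROLE` tables, the limit of three calls and the status strings are hypothetical and are not taken from any particular framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    token: str
    path: str
    principal: Optional[dict] = None  # filled in by the authentication step


class Rejected(Exception):
    """Raised by a filter to stop the pipeline."""


# Hypothetical lookup tables standing in for a real authentication server.
KNOWN_TOKENS = {"token-abc": {"user": "alice", "app": "demo-app", "roles": {"reader"}}}
REQUIRED_ROLE = {"/reports": "reader", "/admin": "admin"}
call_counts = {}


def authenticate(req: Request) -> None:
    principal = KNOWN_TOKENS.get(req.token)
    if principal is None:
        raise Rejected("401 unknown token")
    req.principal = principal


def authorize(req: Request) -> None:
    # Relies on authenticate() having identified the caller and its roles.
    role = REQUIRED_ROLE.get(req.path)
    if role is not None and role not in req.principal["roles"]:
        raise Rejected("403 missing role")


def rate_limit(req: Request, limit: int = 3) -> None:
    # Also relies on authenticate(): calls are counted per application.
    app = req.principal["app"]
    call_counts[app] = call_counts.get(app, 0) + 1
    if call_counts[app] > limit:
        raise Rejected("429 too many calls")


# Authentication runs first because every later filter reads req.principal.
PIPELINE = [authenticate, authorize, rate_limit]


def handle(req: Request) -> str:
    try:
        for step in PIPELINE:
            step(req)
    except Rejected as exc:
        return str(exc)
    return "200 ok"
```

If `authenticate` were removed or moved later in the list, both `authorize` and `rate_limit` would have nothing to work with, which is the point the section makes.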

So what is a token?

A token is a piece of data that only a specific authentication server could have created and that contains enough information to identify a particular entity or entities. Tokens are created using various techniques from the field of cryptography.

Particular Entity: A token should always carry two types of entities within its identity subjects, a User and an Application. A token should always be tied to an Application entity, and it should always carry the identity of the Resource owner who authenticated and accepted the permissions to share information and access to the resources.

Cryptography: All security systems use cryptographic techniques in one form or another, whether for sharing an application secret with an application developer, verifying the integrity of resource data, and so on. However, one of the key techniques used in token generation and validation for API access is the public-private key pair, which is used to enforce integrity checks on the token and to ensure that tokens have indeed been issued by the intended authentication server.

Public Private keyPair usage for Integrity Check
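The idea behind the key pair can be shown with a deliberately simplified sketch: the authentication server signs with a private value, and anyone holding the public values can check the signature. This is textbook RSA over two well-known Mersenne primes with no padding. It is a toy and is not secure. The names `sign`, `verify` and `digest` are made up for this example, and a real system would use a vetted cryptography library with properly generated keys.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Illustration only:
# real keys are generated randomly and used with a padding scheme.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept on the auth server


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Authentication server side: requires the private exponent D."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Resource server side: requires only the public values N and E."""
    return pow(signature, E, N) == digest(message)
```

A resource server that knows only `N` and `E` can confirm that a token body was signed by the holder of `D`, and any change to the body or the signature makes `verify` return `False`.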

Authentication Server

The authentication server is the key system responsible for generating tokens and validating them when they are presented to the resource APIs.

Application Lifecycle Management

Since the token system functions with an Application entity, it is essential that the lifecycle of applications is managed within the authentication server. For example, when a new application is registered, it is in a “Registered” state, and when tokens are being generated for it, the application moves into an “Active” state. Sometimes the application might be “Blocked” due to erratic and risky behavior, calls beyond the permitted bandwidth and so on.

Application lifecycle state diagram
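The three states named above can be modelled as a small state machine. The sketch below is only one possible reading: the article names the states but not the full set of transitions, so the `ALLOWED` table (including the ability to unblock an application) is an assumption, as are the `Application` class and its method names.

```python
from enum import Enum


class AppState(Enum):
    REGISTERED = "Registered"
    ACTIVE = "Active"
    BLOCKED = "Blocked"


# Assumed transitions; the text only names the three states.
ALLOWED = {
    AppState.REGISTERED: {AppState.ACTIVE, AppState.BLOCKED},
    AppState.ACTIVE: {AppState.BLOCKED},
    AppState.BLOCKED: {AppState.ACTIVE},
}


class Application:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = AppState.REGISTERED  # every new application starts here

    def move_to(self, new_state: AppState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.state.value} -> {new_state.value} is not allowed"
            )
        self.state = new_state

    def can_receive_tokens(self) -> bool:
        # A blocked application must not be issued new tokens.
        return self.state is not AppState.BLOCKED
```

Keeping the transition table in one place makes it easy for the token issuing code to ask a single question, `can_receive_tokens()`, before doing any work.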

Token Lifecycle Management

Similarly, a token goes through different stages in its lifecycle. For example, when a user consents to an application accessing the APIs on their behalf, tokens are generated as a result of that consent. However, the “access tokens” might expire soon and need to be refreshed using the “refresh tokens”, and if the user revokes the consent, the tokens lose their access to the APIs as well. It is crucial for the authentication server to manage this lifecycle of token generation, refresh, validation and revocation within its system.

Token Lifecycle State Diagram
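A rough sketch of these stages follows. The state names "active", "expired" and "revoked", and the `Token` and `Consent` classes, are assumptions chosen for illustration; the point is only that expiry is a function of time while revocation follows from the user withdrawing consent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    kind: str            # "access" or "refresh"
    expires_at: float    # Unix timestamp
    revoked: bool = False


def token_state(token: Token, now: float) -> str:
    """Derive the lifecycle state; revocation takes precedence over expiry."""
    if token.revoked:
        return "revoked"
    if now >= token.expires_at:
        return "expired"
    return "active"


@dataclass
class Consent:
    user: str
    app: str
    tokens: List[Token]

    def revoke(self) -> None:
        # Withdrawing consent revokes every token issued under it.
        for token in self.tokens:
            token.revoked = True
```

For instance, an access token that has passed its `expires_at` reports "expired" while its refresh token is still "active", and calling `Consent.revoke()` moves both to "revoked".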

What is the difference between a Refresh and an Access token?

An “Access Token”, as its name implies, provides access to an API. This is the token that is sent to the resource server to obtain information or perform an action on behalf of the user. Access tokens are usually short-lived, so that a token captured in a MITM (Man-In-The-Middle) attack does not give the attacker long-lived access.

A “Refresh Token” on the other hand represents the consent provided by the user and is utilized to obtain a new “access token” when the current access token expires. The refresh tokens cannot be used to access any resource API and can be utilized only to generate a new access token.
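The split between the two token types can be sketched as follows. The `TokenServer` class, its method names and the two lifetimes (five minutes and thirty days) are hypothetical values picked for the example, and the tokens here are opaque random strings kept in an in-memory table rather than any specific token format.

```python
import secrets
import time

ACCESS_TTL = 300            # assumed lifetime: 5 minutes
REFRESH_TTL = 30 * 86400    # assumed lifetime: 30 days


class TokenServer:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._tokens = {}   # token value -> (kind, user, app, expires_at)

    def _issue(self, kind, user, app, ttl):
        value = secrets.token_urlsafe(32)
        self._tokens[value] = (kind, user, app, self._clock() + ttl)
        return value

    def grant(self, user, app):
        """Called once the user has consented; returns (access, refresh)."""
        return (
            self._issue("access", user, app, ACCESS_TTL),
            self._issue("refresh", user, app, REFRESH_TTL),
        )

    def _lookup(self, value, kind):
        entry = self._tokens.get(value)
        if entry is None or entry[0] != kind or entry[3] <= self._clock():
            return None
        return entry

    def check_access(self, value):
        """Resource APIs accept access tokens only, never refresh tokens."""
        return self._lookup(value, "access") is not None

    def refresh(self, refresh_value):
        """Exchange a valid refresh token for a new access token."""
        entry = self._lookup(refresh_value, "refresh")
        if entry is None:
            raise PermissionError("invalid or expired refresh token")
        _, user, app, _ = entry
        return self._issue("access", user, app, ACCESS_TTL)
```

The `clock` parameter exists only so that expiry can be exercised without waiting; in this sketch a refresh token presented to `check_access` is rejected, and an access token presented to `refresh` raises `PermissionError`.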

Token Structure

The second component of the authentication server is the creation of the token itself, i.e., constructing the token in a well-thought-out and planned manner. Sample tokens from some commonly used identity providers are given below.

# eBay
AgAAAA**AQAAAA**aAAAAA**E6+EWg**nY+sHZ2PrBmdj6wVnY+sEZ2PrA2dj6wMkIGkCJCGoA2dj6x9nY+seQ+/5wK1dskM5/3EOEY7BDg7VHK/CmDimCvVPbtJankHhzJUF8rU876Qzjs
# Google
ya29.GltiBRICgroWhf0XJ-e4nYpzc9UG0Fn_Ghq06_yg3BDZ4EHM_X8rIirEnFUJVb9uawqW2tE9yqfT0KwcaEXLKp7VFpde5v
# Facebook
EAACEdEose0cBAJyrAOqIWCAPVobbylB7mZB7X3L0x5BLBosAAm2BDdUnhYKSp7VM9Tpyi8EhrAD6ZBYZBtymYC5ZBxNv1XrCBngEi0gEWLejezZb0gkArZBkJWcFiVjGcKYy44EY8ZD

None of them look alike, though most of them might follow one of the common token representation standards, such as SAML and JWT.

What does a JWT look like?

JSON Web Tokens (JWT) are one of the most commonly used standards for token representation. A typical JWT has three parts separated by a period (‘.’): Header, Payload and Signature.

JWT Token Representation — from http://jwt.io

The header names the algorithm used to sign the token, while the payload contains the claims or subject information that the token represents. The signature is computed over the encoded header and payload, either with a shared secret (symmetric key) or with a private key secured within the authentication server. In the latter case, the token’s authenticity can be verified with the corresponding public key exposed by the authentication server.
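The shared-secret variant is simple enough to sketch with the Python standard library alone. The example below builds and checks an HS256 (HMAC-SHA256) token; the function names `encode_jwt` and `decode_jwt` are made up for this article, and the sketch checks only the signature, not expiry or other claims. A production system would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used for each JWT part."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def decode_jwt(token: str, secret: bytes) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has exactly three parts")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Recompute the signature over header.payload and compare in constant time.
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))
```

Calling `encode_jwt({"sub": "alice"}, b"s3cret")` yields a string with the three dot-separated parts described above, and `decode_jwt` raises `ValueError` if the payload has been altered or a different secret is used.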

JWTs are also commonly used in OpenID Connect, and implementations can be found at Google, eBay, etc.

What goes in the claim?

A good analogy for determining token contents is the web session cookie. A web session holds every bit of information (either on the client side or on the server side) about the user's access and authentication, and this information is used on every web call to the server to verify that the call is genuine and that the interaction truly belongs to the authenticated user.

Tokens should bring the same rigor to their claim structure, capturing every unique identifier that can help establish trust on the API server side when an API resource is invoked. It is crucial to capture and evaluate both incidental information (user and app entities) and associative data (IP address, time of access, etc.) during API access, and to cross-reference them to identify risky patterns, as is done for web access to prevent fraudulent activity.
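As a rough illustration, a claim set along these lines might look like the sketch below. The `sub`, `iat` and `exp` names are registered JWT claims; `app_id` and `ip` are hypothetical private claims invented for this example, as are the `build_claims` and `risk_signals` helpers and the signal names they return.

```python
import time


def build_claims(user_id, app_id, ip_address, now=None, ttl_seconds=300):
    """Collect entity and contextual data at the moment the token is issued."""
    issued_at = int(time.time() if now is None else now)
    return {
        "sub": user_id,                  # registered claim: the user
        "iat": issued_at,                # registered claim: issued at
        "exp": issued_at + ttl_seconds,  # registered claim: expiry
        "app_id": app_id,                # hypothetical private claim
        "ip": ip_address,                # hypothetical private claim
    }


def risk_signals(claims, request_ip, now):
    """Cross-reference the claims with the current API call."""
    signals = []
    if now >= claims["exp"]:
        signals.append("expired")
    if request_ip != claims["ip"]:
        signals.append("ip-changed")
    return signals
```

What the API server does with a signal such as "ip-changed" (reject, step up authentication, or just log it) is a policy decision outside the scope of this sketch.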

But then why do the tokens issued by identity providers such as Google, Facebook or eBay not look like JWT or SAML tokens? To find out, please continue to Part 2 of this series.