Token gang (Bearer token, Reference token, Opaque token, Self-contained token, JWT, Access token, ID token, Refresh token)

iamprovidence
16 min readJun 27, 2024

--

Nowadays, token-based authorization is so popular, that it is safe to say it dominates the field. Despite that, for unprepared developers who have just started learning about it, it is quite easy to become overwhelmed by the extensive range of definitions and terminologies. In this article, I will try to clarify everything for you.

Today, you will learn about Bearer tokens. The differences between reference and self-contained tokens. You will delve into the concept of JWT. For the final, you will explore how tokens are categorized based on their usage, such as Access tokens, ID tokens, and Refresh tokens.

Remember there are other articles in a series.

You can read about Authentication theory here:

How to implement authentication in C# environment:

You may also check out some others stories:

  • What is authentication schema in ASP about?
  • Authorization policy under the hood
  • Encapsulate authentication with DelegatingHandler
  • How to parse token on client side

If are are ready, let us begin.

Bearer token

There are plenty of algorithms to implement authentication in the project. Those algorithms are called schemas. Even though they work differently, they have something in common. Most of the time, the client should include an Authorization header in a request with the schema name followed by a schema’s parameters.

For example, in Basic authentication, the client adds the word “Basic” followed by a space and a base64-encoded string of the form “login:password”:

HTTP: https://my-app.com/get-resource

Authorization: Basic base64encode(login:password)

In the Digest scheme that would be the user’s name and hash of the password:

HTTP: https://my-app.com/get-resource

Authorization: Digest username="john", hash="5ccc069c403eb171e9517f40e41"

ApiKey schema requires an API key as a parameter:

HTTP: https://my-app.com/get-resource

Authorization: ApiKey 9c403eb171e9517f40e415ccc069c403eb171e9517f40e41

We are interested in a Bearer schema. The headers, in this case, will have the next look:

HTTP: https://my-app.com/get-resource

Authorization: Bearer <token>

Where <token> is an arbitrary string that can be used for authorization. The token used in the Bearer schema is called a Bearer token.

There are three types of tokens:

  • reference token (also known as opaque token)
  • self-contained token
  • hybrid token

Let’s discuss them in more detail.

Reference token (Opaque token)

Reference token (aka opaque token) is just a random string, that references a user’s session in a database.

Let’s see how it can be used:

  1. A user enters his login and password in the login form
  2. If the server successfully validates those, it creates a user’s session in the database and issues a random string related to this session called a reference token
  3. User Authentication. The client stores that token and includes it in all subsequent requests:
HTTP: https://my-app.com/get-resource

Authorization: Bearer pS4zFXYx0URtDEkMWOcGunMgxA7cGyXYx0URtDEkMWOcGunMgxA7

4. Server Verification. The server, upon receiving the client’s request, validates whether there is a session in the DB with such token. If so, the access is granted

There are few advantages of using opaque tokens:

  • session management. all active user’s sessions can be listed
  • revocation. in case the opaque token is stolen by intruders, the server can terminate a user’s session
  • flexibility. the server can update the user’s permission at any time
  • enhanced security. since the token itself doesn’t contain any sensitive information, the risk of data exposure is reduced. Even if the token is intercepted, an attacker won’t be able to extract any useful information from it
  • payload size. since reference tokens are just identifiers they usually are small in size, leading to less data transmitted with each request

But with next disadvantages:

  • reduced client-side transparency. since the reference token is just a random string, it can not be decoded, as a result, the clients have limited visibility into the token’s contents. This also makes debugging and troubleshooting more challenging
  • performance. token validation costs a DB request. For distributed system like, microservices, it is even worse. You need an http call to an identity server that will do a DB request
  • memory. tokens and client session information (permission, expiration time, etc) should be stored in the database

Self-contained token

A self-contained token is a type of token that contains all the necessary information within itself.

The authentication flow is as follows:

  1. A user enters his login and password in the login form
  2. If the server successfully validates those, it issues a self-contained token
  3. User Authentication. The client stores that token and includes an Authorization header in all subsequent requests with the word “Bearer” followed by a space and the token:
HTTP: https://my-app.com/get-resource

Authorization: Bearer xA7cGyXYx0URtDEkMWOcGunMgxA7pS4zFXYx0URtDEkMWOcGunMg

4. Server Verification. The server, upon receiving the client’s request, validates the token, its signature, and grants access

Even though the token look alike to the reference one, there is a crucial difference. The self-contained token can be decoded and the client can read it’s content. Additionally, the server does not need to store anything, since the token contains all information within itself (hence the name, self-contained).

The benefits are:

  • performance. token’s validation does not require database lookup since the token is self-contained. This is especially useful in the distributed systems like microservices
  • memory. tokens do not require to be stored in the database
  • stateless: the server does not need to store session information. This simplifies server-side implementations and allows for easier horizontal scaling

But remember about those issues:

  • session management. there is no way to list active user’s sessions
  • revocation. the server can not end the user’s session (self-contained token is valid even after a user logs out)
  • limited flexibility. since the token content is fixed at the time of issuance, updating or modifying the token structure may require issuing a new token
  • sensitive data exposure. care must be taken not to include sensitive information in the token payload since it can be stolen and decoded by intruded
  • payload size. self-contained tokens payload may grow potentially increasing network overhead

As you can tell this is the exact opposite of a reference token.

The token we use is hard to comprehend by our human minds. This is because the token is encrypted to make it more size and concise.

xA7cGyXYx0URtDEkMWOcGunMgxA7pS4zFXYx0URtDEkMWOcGunMg

In case you know the encryption algorithm and token format it can be decoded.

There are many formats of a self-contained token. Let’s just see a couple of those:

  • Simple Web Token (SWT) — in this format data is stored as key/value pairs separated by ampersands.

The decoded SWT will look something like this:

Issuer=http://auth.service.com&
Audience=http://myservice.com&
ExpiresOn=1435937883&

UserName=John&
UserSurname=Doe&
UserRole=Admin&

HMACSHA256={signature}
  • Security Assertion Markup Language (SAML) — is an XML-based token:
<saml>
<saml:Issuer>http://auth.service.com</saml:Issuer>
<saml:Audience>https://idp.example.com</saml:Audience>
<saml:Subject>
<saml:Name>John</saml:Name>
<saml:Surname>Doe</saml:Surname>
<saml:Role>Admin</saml:Role>
</saml:Subject>
<!-- additional information -->
</saml>
  • JSON Web Token (JWT) — this formats rely on JSON:
{
"alg": "HS256",
"typ": "JWT"
}
.
{
"iss": "http://auth.service.com",
"aud": "http://myservice.com",
"name": "John",
"surname": "Doe",
"role": "Admin"
}
.
signature

As you may guess, the most popular format is JWT, so that is the one we going to talk about the most.

Hybrid token

Before talking about our prom queen JWT, just a few words about the Hybrid model.

This flow is a combination of reference and self-contained tokens inheriting their advantages and disadvantages.

The server issues a self-contained token and also stores the session in the database.

It is rare to find a hybrid model, but it is still useful in some cases.

For example, your server validates a self-contained token for most of the actions. However, for sensitive operations, like money processing, it is crucial to make sure the token is still valid and not revoked. In that case, the server can sacrifice the performance and do a Database lookup to validate the token.

JWT

A well-know JWT 😌. Let’s see what you’re made of.

JWT (JSON Web Token ) — is a cryptographically secure self-contained token that stores information in JSON format.

JWT is composed of three parts separated by dots:

  • header
  • payload
  • signature
header.payload.signature

In encoded form it will look like this:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

But when decoded, like this:

{
"alg": "HS256",
"typ": "JWT"
}
.
{
"sub": "1234567890",
"name": "John Doe",
"iat": 1516239022
}
.
signature

Let’s discuss each part individually.

header

header — is a JSON object that contains information about the token and the signing algorithm:

{
"alg": "HS256",
"typ": "JWT"
}

There are two properties:

  • typ —the type of token (which is JWT)
  • alg — the signing algorithm being used to create the signature (which is HMAC-SHA256 in our case). You will see it later

payload

payload — is a JSON object that contains key/value pairs of some information about the user, like his id, name, phone number, address, etc. Those pairs are called claims.

{
"sub": "1234567890",
"name": "John Doe",
"iat": 1516239022
}

There are two types of JWT claims:

  • registered. standard claims defined in the JWT specification to ensure compatibility with third-party, or external applications. Registered claims usually contain information about token
  • custom. arbitrary values, defined by a developer. Custom claims usually contain information about a user (name, phone number, etc)

Some common registered claims include:

  • iss (issuer) — indicates the issuer of the token (usually URL of the server which issued the token, http://auth.service.com)
  • aud (audience) — recipient for which the JWT is intended (usually URL of the server where the token will be used, http://myservice.com)
  • sub (subject) — identifies the subject of the token (usually the user id)
  • exp (expiration time) — specifies the expiration time of the token
  • iat (issued at) — indicates the time at which the token was issued
  • nbf (not before) — specifies the time before which the token must not be accepted for processing
  • jti (jwt id) — indicates a unique token identifier

Those claims are not required but can be useful.

You can also add your own custom claim. It is a common practice to add information that you need to access almost in every http request in the claims, like (the user’s name, permissions, tenant id, etc). It is faster to read it from the token than to make a DB lookup. Be careful, not to include anything sensitive there, like the user’s password.

{
"name": "John Doe",
"role": "admin",
. . .
}

⚠️advanced section⚠️

JWTs are self-contained, it is very hard to revoke them, once issued and delivered to the recipient. Because of that, you should use as short expiration time for your tokens as possible. Avoid giving your tokens an expiration time in days or months.

Remember that the exp claim, containing the expiration time, is not the only time-based claim that can be used for verification.
The nbf claim contains a “not-before” time. The token should be rejected if the current time is before the declared nbf claim time.
Another time-based claim is iat — issued at. You can use this claim to reject tokens that deem too old to be used with your resource server.

When working with time-based claims remember that server times can differ slightly between different machines. You should consider allowing a clock skew when checking the time-based values. This should be value of a few seconds, and it is not recommended to use more than 30 seconds for this purpose, as this would rather indicate problems with the server, rather than a common clock skew

signature

signature — is encoded part of the token that guarantees its validity

The signature can be calculated with the next code:

var SECRET_KEY = "a-random-string-only-server-knows-cAtwa1kkEy";

var unsignedToken = base64UrlEncode(header) + '.' + base64UrlEncode(payload);
var signature = HMAC-SHA256(unsignedToken, SECRET_KEY);

Where HMAC-SHA256 is the signing algorithm defined in the header and SECRET_KEY is just a random string stored somewhere in the server. It is used to both create and validate the token.

⚠️advanced section⚠️

It is common to share secret with applications you trust, so they can validate the token themself

HMAC-SHA256 is a symmetric encryption algorithm that requires one key to create and validate the signature. Use it if the secret is shared with first-party applications. It is simpler and faster

You can also use RSA, which is an asymmetric encryption algorithm that requires two keys: public and private. With private you create a signature, with public you validate it. Use it if the secret is shared with third-party applications. Keep in mind, that it is more complex and computationally intensive

Finally, we can create a JWT token:

var jwtToken = 
base64UrlEncode(header) + '.' +
base64UrlEncode(payload) + '.' +
base64UrlEncode(signature);

It is important to understand that anybody can decode the JWT since it is encoded with a well-known Base64 encoding mechanism. No sensitive data should be put in the claims.

However, signatures guarantee that the token cannot be forged. If the token is leaked, the intruded can use it, but cannot change its content, for example, setting a different name or permission. Because it will result in a different signature, making the token invalid.

This is why SECRET_KEY should be stored safely. If it is leaked anyone can create a valid token.

⚠️advanced section⚠️

Changing SECRET_KEY will invalidate all previously issued tokens because the signature verification will fail

Before moving to the next section, you should know about a vital websites for working with JWT. It is jwt.io. You can decode or create any token there.

Access token

Tokens can be used for authorization.

Authorization (AuthZ) — is the process of defining what permissions a user has

So when a client sends a request like this, a server can determine whether the client has permission to view data or perform actions.

HTTP: https://my-app.com/get-resource

Authorization: Bearer <token>

Token used in authorization called access token. An access token allows access to an API resource.

There are no strict rules or standards saying which type should be used for access tokens. It can be any: opaque or self-contained token. It can be in any format JWT, SWT, custom format, etc.

⚠️ be cautious⚠️

When integrating with external systems, the access token’s implementation may and will vary. It is not always JWT.

In theory, the client should not care about the type of token and should not attempt to decode it. The access token is intended to be understood only by the server, which uses it to retrieve authorization information such as permissions, roles, etc.

However, in practice, the same developers who are building the client are also building the server 🙃. And most likely JWT is used as an access token. Often developers will include user’s information there, so the client application can decode it when needed.

Even though, this way access token can be used for authentication, it is not the desired intent.

⚠️advanced section⚠️

Alternatively, instead of decoding the access token on the client, the developers will add an endpoint to return the current user.
This allows the server to change access token implementation without affecting the client app.

HTTP: https://my-app.com/get-current-user

ID token (Identity Token)

Tokens can be used for authentication.

Authentication (AuthN) — is the process of defining who a user is

Upon entering the user’s credentials, apart from the access token server also issues the ID token to the client.

HTTP: https://my-app.com/log-in

Response:
{
"access_token": "...",
"id_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
. . .
}

Token used in authentication called ID token. An ID token contains information about the current user.

Unlike access tokens, ID tokens have strict standards saying that it should always be JWT. Moreover, they are meant to be inspected and used by the client application.

{
"firstName": "John",
"lastName": "Doe",
"age": 18,
"profilePicture": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
. . .
}

It is rare to find the ID token being used in small applications. However, it is an essential part of the OpenID protocol.

Refresh token

A refresh token is a special token that is used to obtain new access or ID tokens.

When a user authenticates, they receive an access token (and an ID token), as well as a refresh token. If the application needs to access resources again and the previously provided access token has expired, it uses the refresh token to request a new access token.

It is common for the access token to have a short lifetime (something like 30 minutes), while the refresh token can have a long lifetime (up to 30 days).

Refresh tokens are typically some random identifier stored in the database and get removed after use:

s6adsa-73a708f8-bd38-4c0e-97ee-2b27ca44d-z9xc4c0e-97ee-2b27ca44d-z9xc-za87g

People usually struggle with understanding the need for refresh token:

Why two tokens? 🦧
Why access token is not enough?
What is the point of a refresh token?

Honestly, I have seen multiple applications working totally fine without a refresh token 😁. But, there is always a big BUT 🍑.

Refresh token is all about security. Limiting the expiration time of an access token enhances reducing the potential impact of a compromised token.

Imagine the next scenario:

  • 1️⃣
    - an access token without expiration time has been stolen
    - no refresh token available

In the case of a self-contained token, an attacker would have infinite access to the server because you can not revoke self-contained tokens. In the case of reference token, the attacker would have access to the server until the user logs out (but some users never log out). 😨

This is why it is important for access tokens (both reference and self-contained) to have a short lifetime. So if the token is compromised, the attacker would do as little damage as possible.

However, just having tokens with a short lifetime still causes some issues. So let’s go to the scenario number two:

  • 2️⃣
    - an access token with a finite expiration time has been stolen
    - no refresh token available

An attacker would have limited access to a system. As soon as the token expires, our dirty intruder can no longer harm us. 😎

Our regular user will also have the access token with a short lifetime. He will be forced to re-login every time it expires. 😨

Not the most pleasant user experience 😒. Therefore we need a refresh token. It will allow us to update the access token transparently for a user.

Let’s see it in the scenario number tres:

  • 3️⃣
    - an access token with a finite expiration time has been stolen
    - a refresh token is available

An attacker would have access to the application for a finite time. 😎

Our regular user will also have the access token with a short lifetime. However, when the access token expires, our application will refresh it, and the user continue working as if nothing happened. 😎

The refresh token is usually harder to steal since it is only sent on one specific http-call whereas the access token is sent on every http-call. The attack surface is way smaller. But it still can be stolen:

  • 4️⃣
    - an access token with a finite expiration time has been stolen
    - a refresh token has been stolen
    - a refresh token has not been used by an intruder

Attacker would have access to the application for a finite time. 😎

As soon as the user’s access token expires, it is exchanged for a new pair of refresh and access tokens. The user continues working as if nothing happened.😎

The intruder tries to use the stolen refresh token but is no longer valid since it was already used by our user. The attacker is harmless now. 😎

For the final, ️️what if the attacker used his refresh token before the user did:

  • 5️⃣
    - an access token with a finite expiration time has been stolen
    - a refresh token has been stolen
    - a refresh token has been used by an intruder

An attacker would have access to the application for a finite time. 😎
When his access token expires, he will exchange it for new refresh and access tokens. 😨

As soon as the user’s access token expires, he will try to exchange it for new refresh and access tokens, but the intruded did that before, so the user will be logged out. 😨

As soon as the user logs in again, he will obtain new access and refresh tokens, invalidating those that intruded has.
So we are gucci again.😎😎😎

The refresh token gives other benefits too:

  • show all active user’s sessions (even for self-contained token 😃)
  • signing out a user from all sessions
  • asking users when they sign in if they want to stay signed in when inactive (the checkbox you see sometimes: “Stay signed in for X days”)

Closing Remarks

I hope the variety of information I provided hasn’t overwhelmed you 😅. Do not worry if you need to read everything a couple of times. You can skip advanced sections if you are not confident enough. I know, that by this point, you already forget half of the article anyway. Do not worry, it is totally normal 🙃. For example, I don’t even remember what I had for lunch today 👴.

When you have the opportunity, try reviewing your project to see what authentication mechanism you are using. Discuss its strengths and use cases with your colleagues. You will get a better understanding of tokens with practice. Pray the Lord, to have a refresh token in place 🙏. And if you get lucky enough, JWT will be the only token you encounter during your career 😅.

I hope you learned something new here 🤓. If you enjoyed the article, please give it a clap, it helps to boost my self-esteem 👏. Support me with a link below☕️. Don’t forget to follow for more content on authentication ✅. And remember to keep this article close at hand in case you want to refresh your mind when memory expires😉

--

--

iamprovidence

👨🏼‍💻 Full Stack Dev writing about software architecture, patterns and other programming stuff https://www.buymeacoffee.com/iamprovidence