The self-verification JWT journey

Chad Fisher
Spring Health Engineering
6 min read · Aug 18, 2023

Authorization on the web has many layers:

  • Server side auth
  • Client side “auth”
  • Single Sign On (SSO)
  • Session Data / Cookies
  • JSON Web Tokens (JWTs), i.e. the access_token (the two terms are used interchangeably here)

All of these layers should effectively work together to provide a secure (and hopefully quick) way of authorizing a user initially and for subsequent requests.

This article focuses on external JWT validation vs. self-validation, the pros and cons of both, and how we evolved our application at Spring Health to make more use of these standards.

What is a JWT — A Quick Refresher:

A JWT (JSON Web Token) is a digitally signed string consisting of 3 parts: a header, a payload, and a signature. Its primary function is to authorize a user: the client passes it to a backend web service in the Authorization header. For a better description, check out this article on jwt.io (jwt.io is my go-to for inspecting live tokens and pretty much everything related to them).
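To make the three parts concrete, here is a minimal sketch in Python (standard library only) that splits a token and decodes its header and payload. The token, the claim values, and the helper names are made up for this example, and nothing here verifies the signature.

```python
import base64
import json


def b64url_encode(raw):
    """Base64url-encode bytes the way JWTs do: no trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def split_jwt(token):
    """Return (header, payload, signature_bytes) WITHOUT verifying anything."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)


# Hypothetical token assembled only for this demo; the signature is fake.
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"sub": "user-123", "exp": 1692400000}).encode()),
    b64url_encode(b"not-a-real-signature"),
])

header, payload, signature = split_jwt(demo_token)
```

Anyone holding the token can read the header and payload this way, which is why only the signature check makes the contents trustworthy.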

A Little About the JWT and its Lifecycle:

The current web (as of 2023) has generally moved towards short-lived JWTs (30 minutes is the standard) with a refresh_token TTL of around 12 hours.

This makes total sense! A valid access_token can be passed around, essentially giving ANYONE who holds it access for the duration of its life, so it SHOULD be short-lived (we will get into how to “invalidate” these tokens later, which is not the most trivial problem).

Then along comes the beauty of the refresh_token. When an access_token is about to expire, your front end application can be “smart enough” to send a request with the refresh_token grant_type to your authorization server in order to preserve the ongoing session, essentially REQUIRING the user's session to be re-authorized. The important implication here is that your users must stay in relatively constant communication with your auth server (how constant depends on your access_token life span) in order to maintain a session.
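As a rough illustration of that “smart enough” client logic, the Python sketch below decides when to refresh and builds the form fields of a standard OAuth 2.0 refresh grant. The one-minute leeway, the function names, and the timestamps are assumptions for the example, not details of our front end.

```python
import time

ACCESS_TOKEN_TTL_S = 30 * 60  # 30-minute access_token, as described above
REFRESH_LEEWAY_S = 60         # assumption: refresh one minute before expiry


def needs_refresh(exp, now=None, leeway=REFRESH_LEEWAY_S):
    """True once the access_token is within `leeway` seconds of its exp claim."""
    now = time.time() if now is None else now
    return now >= exp - leeway


def refresh_request_body(refresh_token):
    """Form fields for an OAuth 2.0 refresh grant (RFC 6749, section 6)."""
    return {"grant_type": "refresh_token", "refresh_token": refresh_token}


issued_at = 1_692_360_000              # hypothetical login time (epoch seconds)
exp = issued_at + ACCESS_TOKEN_TTL_S   # exp claim of the access_token
```

With these values, a check one minute after login says no refresh is needed, while a check 30 seconds before expiry says it is time to send the refresh grant.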

The Forceful (and slow) Way of Validation:

At the beginning of Spring Health’s start-up journey, we validated a user’s session through a direct CURL to the Auth Service on EVERY inbound request to one of our Backend Services.

The request flow would look something like this (see image for details):

  • The front end React App would make a call initially to the Auth Service to get a valid access_token
  • Then it would make requests to the Backend Services using the Authorization: Bearer access_token
  • For every such request, the Backend Service would first call the Auth Service to validate the given JWT before doing any other work
Forceful JWT Validation
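The flow above can be sketched in a few lines of Python. The stub class stands in for the remote Auth Service (in production this is an HTTP round trip), and every name here is hypothetical; the point is only that each inbound request costs one validation call.

```python
class StubAuthService:
    """Stands in for the remote Auth Service; the real one is a network call."""

    def __init__(self, valid_tokens):
        self.valid_tokens = set(valid_tokens)
        self.calls = 0

    def validate(self, token):
        self.calls += 1
        return token in self.valid_tokens


def handle_request(authorization_header, auth_service):
    """Return an HTTP-style status code for one inbound backend request."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer" or not token:
        return 401
    if not auth_service.validate(token):  # one remote round trip per request
        return 401
    return 200  # token accepted; the real handler would now do its work


auth = StubAuthService(valid_tokens={"good-token"})
statuses = [handle_request("Bearer good-token", auth) for _ in range(3)]
statuses.append(handle_request("Bearer revoked-token", auth))
```

Four requests lead to four Auth Service calls, including the three that carry the same valid token.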

The Benefits of Forceful JWT Validation:

  • Each request is forcefully validated — Ensuring if an access_token was revoked, that request will be denied
  • It’s simplified and straightforward — All auth is maintained in the Auth Service

The Cons of Forceful JWT Validation:

  • Each request to a Backend Service relies on a CURL request to the Auth Service (aka the Auth Service must be up and running, duh!)
  • Latency is added to each request for the time it takes for the CURL to execute.

Digging a little deeper (cause we are nerds and like to see the data), we can look at a specific example of a single request to our backend in real-time:

Specific Real-Time Forceful JWT Validation Example

The above image has a fairly clear division down the middle, and in fact this division represents 2 things:

  • The left side is essentially the validation request being sent to the auth service
  • The right side is the subsequent processing of the request after a successful validation

The cons are fairly straightforward here (thanks to Datadog!): the authorization portion of this request took 56.7 ms, which represents 38.8% of the total time (146 ms) it took to fulfill this single request.

A step in the right direction — Caching JWT validity:

Our first attempt to mitigate this issue was to cache (via our Redis store) any Auth Service responses when we saw a unique JWT submitted to our Backend services.

We had our hopes high, thinking “This is great, we will mitigate so many requests to our Auth Service!”.

At first things were going as expected: we had a roughly 96% Cache Hit Rate, which reduced our outbound CURLs. But then we saw some interesting behavior in our Cache Hit Rate graph over time:

Cache hit rate over time (see the MASSIVE drop???)

The above graph shows occasional massive drops, where our cache hit rate fell almost to 0 for a few seconds.

We later figured out that these drops were caused by a flurry of near-simultaneous requests from a single auth’d user to the backend. On specific heavy-load pages in our front end, we were sending a bunch of requests to different backend GraphQL endpoints. So the front end would send off (let’s say) 20 async requests on a specific page load to gather information up front.

In turn, the backend would get all these requests (in a very small time interval) BEFORE it had cached the response from the Auth Service for that SPECIFIC access_token, essentially bypassing the caching logic and doing a lookup against the Auth Service for the same access_token multiple times in quick succession. Eventually the backend would catch up, cache the Auth Service response for that access_token, and avoid future CURL requests for the same token (aka we don’t need to do an Auth Service lookup since the token is already validated in the Redis store).
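The race described above can be reproduced with a small Python simulation. A plain dict stands in for the Redis store, the counting class stands in for the Auth Service, and a barrier deliberately forces the worst case in which every request checks the cache before any of them has filled it. All names are hypothetical.

```python
import threading


class CountingAuthService:
    """Stand-in for the Auth Service that counts validation calls."""

    def __init__(self):
        self.calls = 0
        self._lock = threading.Lock()

    def validate(self, token):
        with self._lock:
            self.calls += 1
        return True


def run_burst(n_requests, token, cache, auth):
    """Fire n_requests concurrent requests that all carry the same token."""
    all_checked = threading.Barrier(n_requests)

    def request():
        cached = cache.get(token)   # every request checks the cache...
        all_checked.wait()          # ...before any of them has filled it
        if cached is None:
            cache[token] = auth.validate(token)

    threads = [threading.Thread(target=request) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


cache = {}                  # stands in for the Redis store
auth = CountingAuthService()

run_burst(20, "token-abc", cache, auth)
calls_first_burst = auth.calls                        # all 20 requests missed

run_burst(20, "token-abc", cache, auth)
calls_second_burst = auth.calls - calls_first_burst   # cache is warm now
```

The first page load costs 20 Auth Service lookups for a single token, and only the second burst benefits from the cache.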

Needless to say, we didn’t fully resolve the problem.

JWT Self-Validation — The performance gains are palpable:

Finally, we decided it was time to introduce JWT self (public) validation (for more information on JWT self-validation see https://software-factotum.medium.com/validating-rsa-signature-for-a-jws-10229fb46bbf):

The process we implemented:

  • The Auth Service was already producing RS256-signed tokens
  • We had a public key / private key combination from the Auth Service
  • We implemented logic in our Backend Service (behind a Ruby / Flipper flag) to self-validate the incoming access_tokens using the public key instead of reaching out to the Auth Service
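To show what self-validation involves, here is a sketch of RS256 verification in Python using only the standard library, so each step is visible. It is not our Ruby implementation: the key pair is a toy built from two known Mersenne primes purely so the demo runs anywhere, the function names are made up, and production code should rely on a vetted JWT library and a real RSA key.

```python
import base64
import hashlib
import hmac
import json

# ASN.1 DigestInfo prefix for SHA-256 (RFC 8017, section 9.2).
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")


def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def encoded_message(signing_input, key_len):
    """EMSA-PKCS1-v1_5 encoding of SHA-256(signing_input), key_len bytes long."""
    digest_info = SHA256_PREFIX + hashlib.sha256(signing_input).digest()
    padding = b"\xff" * (key_len - len(digest_info) - 3)
    return b"\x00\x01" + padding + b"\x00" + digest_info


def verify_rs256(token, n, e, now):
    """Return the payload if the signature and exp claim check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(sig_b64)
    except ValueError:
        return None  # wrong number of segments, bad base64, or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "RS256":
        return None  # never let the token pick a different algorithm
    key_len = (n.bit_length() + 7) // 8
    sig_int = int.from_bytes(signature, "big")
    if len(signature) != key_len or sig_int >= n:
        return None
    # Public-key operation: no private key and no network call needed.
    recovered = pow(sig_int, e, n).to_bytes(key_len, "big")
    expected = encoded_message(f"{header_b64}.{payload_b64}".encode("ascii"), key_len)
    if not hmac.compare_digest(recovered, expected):
        return None
    exp = payload.get("exp") if isinstance(payload, dict) else None
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired exp claim
    return payload


# TOY KEY PAIR, demo only: built from two known Mersenne primes.
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q                          # public modulus (the backend holds N and E)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Auth Service only)


def sign_rs256(payload):
    """Stand-in for the Auth Service issuing an RS256 token."""
    header_b64 = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
    payload_b64 = b64url_encode(json.dumps(payload).encode())
    key_len = (N.bit_length() + 7) // 8
    em = encoded_message(f"{header_b64}.{payload_b64}".encode("ascii"), key_len)
    sig = pow(int.from_bytes(em, "big"), D, N).to_bytes(key_len, "big")
    return f"{header_b64}.{payload_b64}.{b64url_encode(sig)}"


token = sign_rs256({"sub": "user-123", "exp": 2_000})
claims = verify_rs256(token, N, E, now=1_000)
```

The backend only needs the public half of the key (N and E here), so a token can be checked entirely in-process. The trade-off is that a revoked token keeps passing this check until its exp claim runs out.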

In order to measure this performance gain, we logged the execution time from the moment we received an access_token until it was fully validated (which, for forceful validation, includes the round-trip response from the Auth Service).

Since validation happens on every request, we limited the data aggregation to Wednesday mornings from 9:00 AM EST till 12:00 PM EST (a 3 hour window).

The results of the data:

+--------------------------------+---------------------+-----------------+
| Metric                         | Forceful Validation | Self Validation |
+--------------------------------+---------------------+-----------------+
| Total Time (sum of all events) | 697,058.91 ms       | 5,084.08 ms     |
| Average                        | 95.12 ms            | 0.76 ms         |
| Worst Case                     | 4,603.54 ms         | 70.10 ms        |
| Best Case                      | 35.69 ms            | 0.49 ms         |
+--------------------------------+---------------------+-----------------+

Highlights of the above data:

  • Our WORST CASE with self-validation (70.10 ms) is about 25 ms BETTER than the average of our validation process before self-validation (95.12 ms).
  • Some poor soul had to wait 4.6 SECONDS for a SINGLE CURL under Forceful Validation
  • We saved our users 691.97 seconds within the 3 hour window using the new self-validation method.
  • The self-validation method is roughly 125 times faster than the old method (calculated by taking average-old / average-new, i.e. 95.12 ms / 0.76 ms from the table above)
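These highlights can be recomputed from the table with a few lines of Python. The sketch uses only the rounded values published in the table, so its results can differ slightly from figures derived from the raw logs.

```python
# Rounded values copied from the results table (milliseconds).
forceful = {"total_ms": 697_058.91, "avg_ms": 95.12, "worst_ms": 4_603.54}
self_val = {"total_ms": 5_084.08, "avg_ms": 0.76, "worst_ms": 70.10}

# Total validation time saved over the 3 hour window, in seconds.
saved_s = (forceful["total_ms"] - self_val["total_ms"]) / 1000

# Speed-up factor: average-old / average-new.
speedup = forceful["avg_ms"] / self_val["avg_ms"]

# How far the new worst case sits below the old average.
worst_vs_old_avg_ms = forceful["avg_ms"] - self_val["worst_ms"]

print(f"time saved: {saved_s:.2f} s")
print(f"speed-up (avg / avg): {speedup:.1f}x")
print(f"new worst case beats old average by {worst_vs_old_avg_ms:.2f} ms")
```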

Conclusion:

The JWT self-validation life is the way to go for performance — without question.

The good news is, we still have the feature flag in place to do forceful validation (CURL request to the Auth Service) in the case we ever need to ensure an access_token hasn’t been revoked (in the off chance of a security breach).

In the spirit of Spring Health’s 10x growth and 10x speed, this performance upgrade certainly makes the cut.

Acknowledgments:

Shoutout to Rob Durst who inspired me to write this blog, and all my wonderful teammates along the way — Go platform team!
