Improving Session Token Security with Zero Downtime

Eric Higgins
Engineers @ Optimizely
Jan 13, 2016

Updating session tokens on a high-traffic web service like the one we run at Optimizely presents several engineering challenges. Customers reasonably expect our service to be always on, so taking it offline for “scheduled maintenance” is no longer an acceptable option. That means migrating users live and maintaining backward compatibility, all while avoiding any customer-facing service interruption.

At Optimizely, we’ve recently done this successfully and we’d like to share how we did it and what we learned for the benefit of the global engineering community.

Versions to the Rescue

The basic technique we used to do this was versioning. In the case of our session tokens, which are used to set a cookie on an authenticated customer’s browser, we started by adding a version property to each Session instance on the server, where version 1 corresponded to the current (or legacy) methods used to generate, hash, and validate tokens. One added benefit of this approach is that it makes future migrations trivial if the new methods ever become inadequate or compromised.

In the example code snippets below, we’re using the Python runtime on Google App Engine with the NDB Datastore library, but the same principle can be adapted to any language or storage layer.

...
from google.appengine.ext import ndb

SESSION_VERSION = 1

class Session(ndb.Model):
    user_id = ndb.StringProperty()
    expiration = ndb.DateTimeProperty()
    created = ndb.DateTimeProperty(auto_now_add=True)
    modified = ndb.DateTimeProperty(auto_now=True)
    # New sessions are stamped with the current default version.
    version = ndb.IntegerProperty(default=SESSION_VERSION)
...

Then, we refactored the functions for generating and verifying the session tokens to accept a version parameter, which is used to call version-specific functions. Again, we deployed this code into production and there was still no user-facing impact or change.

You can see an example of how we implemented this for the generate_session_token function below, using a simple function map. The same concept applies to the functions used to validate the tokens, which we omitted for brevity.

...
def _generate_session_token_v1():
    return deprecated_token_generator()


class InvalidVersion(SessionError):
    """Raised when an invalid session version is requested."""


def generate_session_token(version=None):
    """Generate and return a version-specific session token."""
    fn_map = {
        1: _generate_session_token_v1,
    }
    if not version:
        # Fall back to the legacy version when none is given.
        version = 1
    fn = fn_map.get(version)
    if fn is None:
        raise InvalidVersion('version %d is not supported' % version)
    return fn()

...
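
The verification side, omitted above, can follow the same dispatch pattern. The sketch below is only an illustration: verify_session_token, _verify_session_token_v1, and the constant-time comparison are assumptions, not Optimizely’s actual validation code, and SessionError is redefined here only to keep the example self-contained.

```python
import hmac


class SessionError(Exception):
    """Stand-in base class; the real one lives in the elided code."""


class InvalidVersion(SessionError):
    """Raised when an invalid session version is requested."""


def _verify_session_token_v1(presented, stored):
    # Hypothetical v1 check: constant-time comparison of the two tokens.
    return hmac.compare_digest(presented, stored)


def verify_session_token(presented, stored, version=None):
    """Verify a token with the function registered for its version."""
    fn_map = {
        1: _verify_session_token_v1,
    }
    if not version:
        version = 1  # same legacy default as generate_session_token
    fn = fn_map.get(version)
    if fn is None:
        raise InvalidVersion('version %d is not supported' % version)
    return fn(presented, stored)
```

Adding a v2 verifier later is then a one-line change to fn_map, mirroring the generation side.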

Next, we defined the improved generation and verification functions and labeled them as v2. They were not yet being used when we deployed to production, but it gave us the ability to make incremental changes and test thoroughly along the way.

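import binascii
import os

def _generate_session_token_v2():
    # 64 random bytes from the OS, hex-encoded into a 128-character token.
    return binascii.hexlify(os.urandom(64))

def generate_session_token(version=None):
    """Generate and return a version-specific session token."""
    fn_map = {
        1: _generate_session_token_v1,
        2: _generate_session_token_v2,
    }
    # ... the rest of the function is unchanged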
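
On Python 3, binascii.hexlify returns bytes rather than a string, so a port of the v2 generator would need a decode step; the standard-library secrets module offers an equivalent shortcut. A minimal sketch, assuming a Python 3 runtime (the function names here are illustrative and not from our code):

```python
import binascii
import os
import secrets


def generate_token_v2_py3():
    # 64 random bytes from the OS CSPRNG, hex-encoded to 128 characters.
    return binascii.hexlify(os.urandom(64)).decode('ascii')


def generate_token_v2_secrets():
    # secrets.token_hex(n) likewise returns 2*n hex characters.
    return secrets.token_hex(64)
```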

Flipping the Bit

At this point, all of our code was in production and ready to go live. To start the migration, we simply changed the SESSION_VERSION value, which moved the default version from v1 to v2. From then on, every newly created session had its version property set to 2, so the v2 generation and verification functions were used for all new sessions. Existing v1 sessions kept working, handled by the v1 functions, until they expired (sessions expire after 7 days). After expiration, a customer re-authenticates and receives a new v2 session token.

SESSION_VERSION = 2
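
To make the effect of the toggle concrete, here is a small simulation in plain Python, with no Datastore involved. The Session dataclass, create_session, GENERATORS, and the placeholder v1 generator are assumptions made for illustration; only the idea of dispatching on the stored version comes from the code above.

```python
import binascii
import os
from dataclasses import dataclass

SESSION_VERSION = 2  # flipped from 1 to start the migration


def _generate_session_token_v1():
    # Placeholder for the legacy generator, which the article does not show.
    return 'legacy-' + binascii.hexlify(os.urandom(8)).decode('ascii')


def _generate_session_token_v2():
    return binascii.hexlify(os.urandom(64)).decode('ascii')


GENERATORS = {1: _generate_session_token_v1, 2: _generate_session_token_v2}


@dataclass
class Session:
    user_id: str
    version: int
    token: str


def create_session(user_id, version=None):
    """Create a session, defaulting to the current SESSION_VERSION."""
    version = version or SESSION_VERSION
    return Session(user_id, version, GENERATORS[version]())


# A session stored before the flip keeps version 1 until it expires.
old = create_session('alice', version=1)
# Every session created after the flip picks up version 2.
new = create_session('bob')
```

Because each session carries its own version, old and new sessions can coexist for the whole expiry window without any special-case code.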

This technique worked extremely well for us, and we’ve successfully applied it to the hashing algorithm for customer passwords as well. The major difference is that passwords don’t expire the way sessions do, so an account is upgraded to the new algorithm only when the customer next signs in.
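
For passwords, the upgrade has to piggyback on sign-in, since that is the only moment the server sees the plaintext. The following is a rough sketch of that flow, not our actual implementation: the record layout, the function names, and both hashing schemes (a salted SHA-256 standing in for the legacy v1, PBKDF2-HMAC-SHA256 for v2) are assumptions chosen purely for illustration.

```python
import hashlib
import hmac
import os

PASSWORD_VERSION = 2  # version used for all newly written hashes


def _hash_v1(password, salt):
    # Hypothetical legacy scheme: a single salted SHA-256.
    return hashlib.sha256(salt + password.encode('utf-8')).hexdigest()


def _hash_v2(password, salt):
    # Hypothetical replacement: PBKDF2-HMAC-SHA256 (iteration count is illustrative).
    return hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000).hex()


HASHERS = {1: _hash_v1, 2: _hash_v2}


def make_record(password, version=PASSWORD_VERSION):
    salt = os.urandom(16)
    return {'version': version, 'salt': salt, 'hash': HASHERS[version](password, salt)}


def sign_in(record, password):
    """Verify with the record's own version; upgrade it on success."""
    expected = HASHERS[record['version']](password, record['salt'])
    if not hmac.compare_digest(expected, record['hash']):
        return False
    if record['version'] != PASSWORD_VERSION:
        # The plaintext is available now, so re-hash with the current version.
        record.update(make_record(password))
    return True
```

Accounts that never sign in again stay on the old version indefinitely, which is the key difference from sessions that simply expire.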

Want to solve more problems like this? Have a better solution?

We take security very seriously at Optimizely and are always looking for more brilliant and creative folks to solve challenging problems. Interested? We’re hiring!

Eric Higgins
Engineers @ Optimizely

Maker, inventor, engineer, nerd, & author of Security From Zero: Practical Security for Busy People