Native Apps: The “Why” of Versioning and Upgrade

Snowflake Native Apps are self-contained applications that run entirely within the Snowflake data platform. Applications are discovered on the Snowflake Marketplace and installed directly within customer accounts. Extensive safeguards give customers complete control over what the application can access within their environment, and they protect the developer by shielding their intellectual property (the implementation of the application and sensitive data) from the customer.

The definition of the application (the “version”) resides and is maintained in the developer’s account. As new versions are created and tested, they are explicitly released by the developer, at which time they automatically begin rolling out, progressively upgrading applications in customers’ accounts in the background, even while the application is in use.

Native Apps takes a highly opinionated approach to versioning and upgrades. The public docs do a fine job of explaining “how” this process works but, to many, it feels overly complex and, at the same time, far too restrictive. I think this is because the “why” of the approach isn’t evident.

Why do upgrades work the way they do? Let’s talk about that.

How It Works

A quick primer. Native Apps allows at most two major versions of an application to exist:

  • Each version may have many patches
  • At any given time, different accounts can be on different versions or patches (using release directives), or may be in the process of upgrading

BUT, if you want to add a third version, then you must first upgrade everyone off of the oldest version:

THEN, drop the first version:

And finally you can start adding your third version, and migrating accounts:

There are a few important details about this process:

  • Even though an account is upgraded to a new version, code can still be running that was started on the previous version
  • You cannot DROP a version until everyone is upgraded and no more code is running on that version
  • If even a single account fails to upgrade, you have to fix it and get it off the old version before you can drop that version

See, it seems both overly simple and overly complex!
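To make the rule concrete, here is a minimal Python sketch of the two-version limit. It is a toy model, not a Snowflake API: the AppPackage class, its methods, and the account names are all hypothetical, and it ignores patches and running code entirely.

```python
class AppPackage:
    """Toy model of the two-version rule (hypothetical, not a Snowflake API)."""

    MAX_VERSIONS = 2

    def __init__(self):
        self.versions = []   # live major versions, oldest first
        self.accounts = {}   # account name -> version it is installed on

    def add_version(self, version):
        # A third version is refused until the oldest one has been dropped.
        if len(self.versions) >= self.MAX_VERSIONS:
            raise RuntimeError(f"cannot add {version}: drop {self.versions[0]} first")
        self.versions.append(version)

    def upgrade(self, account, version):
        if version not in self.versions:
            raise ValueError(f"unknown version {version}")
        self.accounts[account] = version

    def drop_version(self, version):
        # A version can only be dropped once no account is installed on it.
        stragglers = sorted(a for a, v in self.accounts.items() if v == version)
        if stragglers:
            raise RuntimeError(f"cannot drop {version}: still installed in {stragglers}")
        self.versions.remove(version)


pkg = AppPackage()
pkg.add_version("v1")
pkg.add_version("v2")
pkg.upgrade("acme", "v1")
pkg.upgrade("globex", "v1")
pkg.upgrade("acme", "v2")        # globex is now a straggler on v1

blocked = []
for attempt in (lambda: pkg.add_version("v3"), lambda: pkg.drop_version("v1")):
    try:
        attempt()
    except RuntimeError as err:
        blocked.append(str(err))

pkg.upgrade("globex", "v2")      # fix the straggler...
pkg.drop_version("v1")           # ...then v1 can be dropped...
pkg.add_version("v3")            # ...and only then can v3 be added
print(blocked)
print(pkg.versions)
```

Both early attempts are refused: v3 cannot be added while two versions exist, and v1 cannot be dropped while a single account is still on it.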

It’s All About That State (♫ ‘bout that state ♫)

This upgrade strategy is all about protecting application state. But, what do we mean by “state”?

An application is composed of stateless components that are upgraded from version to version, and stateful components that are shared between versions. Typically your stateless component is the code of the app (functions, procedures, streamlits, etc.), and the stateful content is data (tables). A typical native app installation script may look like the following:

> CREATE APPLICATION ROLE IF NOT EXISTS app_admin;
> CREATE APPLICATION ROLE IF NOT EXISTS app_user;
> GRANT APPLICATION ROLE app_user TO APPLICATION ROLE app_admin;

CREATE OR ALTER VERSIONED SCHEMA code;
> GRANT USAGE ON SCHEMA code TO APPLICATION ROLE app_user;

CREATE OR REPLACE PROCEDURE code.do_thing() ... ;
> GRANT USAGE ON PROCEDURE code.do_thing() TO APPLICATION ROLE app_user;

CREATE OR REPLACE PROCEDURE code.set_param(
name string, value string, comment string) ... ;
> GRANT USAGE ON PROCEDURE code.set_param(string, string, string)
> TO APPLICATION ROLE app_admin;

> CREATE SCHEMA IF NOT EXISTS config;
> CREATE TABLE IF NOT EXISTS config.params (
> param_name STRING,
> param_value STRING,
> change_time TIMESTAMP,
> comment STRING
> );
> ALTER TABLE config.params ADD COLUMN IF NOT EXISTS comment STRING;

Lines marked with > are stateful content.

A few highlights here:

  • The config.params table is created on a new installation. This release adds the comment column, so if the table already exists (meaning we are upgrading), the ALTER TABLE statement adds the column.
  • code.set_param() is a procedure that sets parameters. This new version adds a comment parameter.

During the upgrade, users calling set_param() will see the old two-parameter version (thanks to VERSIONED SCHEMA), and thus nothing will be inserted into the comment column. After the upgrade, set_param() requires a comment to be provided, which will be inserted.

The key points here are:

  • The state changes had to be compatible with code from the previous version
  • The new version had to be prepared to deal with data being produced by the previous version

We call this the compatibility window. For native apps, we provide a guaranteed compatibility window of two major versions.
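The compatibility window can be tried out locally. The sketch below is an analogy only: it uses Python's built-in sqlite3 module rather than Snowflake, and the setup_v1, setup_v2, set_param_v1 and set_param_v2 functions are hypothetical stand-ins for two versions of the setup script and the procedure. SQLite has no ADD COLUMN IF NOT EXISTS, so the check is done by hand with PRAGMA table_info.

```python
import sqlite3


def setup_v1(db):
    # Previous release: no comment column yet.
    db.execute("CREATE TABLE IF NOT EXISTS params "
               "(param_name TEXT, param_value TEXT, change_time TEXT)")


def setup_v2(db):
    # New install: the table is created with the comment column.
    db.execute("CREATE TABLE IF NOT EXISTS params "
               "(param_name TEXT, param_value TEXT, change_time TEXT, comment TEXT)")
    # Upgrade: the table already exists, so add the column if it is missing.
    columns = [row[1] for row in db.execute("PRAGMA table_info(params)")]
    if "comment" not in columns:
        db.execute("ALTER TABLE params ADD COLUMN comment TEXT")


def set_param_v1(db, name, value):
    # Old code names its columns explicitly, so an extra column does not break it.
    db.execute("INSERT INTO params (param_name, param_value, change_time) "
               "VALUES (?, ?, datetime('now'))", (name, value))


def set_param_v2(db, name, value, comment):
    db.execute("INSERT INTO params (param_name, param_value, change_time, comment) "
               "VALUES (?, ?, datetime('now'), ?)", (name, value, comment))


db = sqlite3.connect(":memory:")
setup_v1(db)
set_param_v1(db, "retries", "3")
setup_v2(db)                                   # the upgrade runs
set_param_v1(db, "timeout", "30")              # old code still in flight
set_param_v2(db, "region", "us-west", "added in v2")

rows = db.execute(
    "SELECT param_name, comment FROM params ORDER BY param_name").fetchall()
print(rows)
```

Rows written by the old code end up with an empty comment, which is exactly the data the new version has to be prepared to read. The old insert survives the schema change only because it lists its columns; in SQLite, an insert that relied on column position would fail once the fourth column appears.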

Why are the CREATE APPLICATION ROLE and GRANT statements considered state? Application roles are not versioned, so if a new release drops a role or revokes a grant to an object used in the previous release, it can break the running code of that release!

Two Versions & State Compatibility Contract

Native Apps has a compatibility window of two major versions, giving providers the following contract when they are developing their applications:

  1. Code at version N-1 must be able to handle state for both version N-1 and version N (this includes all patches of N-1 and N)
  2. Code at version N will never execute on state less than N. So if an upgrade to version N fails, then we will not run any code on version N

We enforce this by restricting an application to at most two versions at a time (N-1 and N). If you wish to add a version N+1, then first you must migrate all instances to N and, when that is complete, drop version N-1.

The TL;DR is that a developer can easily reason about their setup script. They only need to worry about compatibility with the immediately previous version.
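The contract boils down to a very small predicate. Here is a sketch in Python, assuming major versions are numbered with consecutive integers; the may_run function is hypothetical and exists only to spell out the two rules.

```python
def may_run(code_version, state_version):
    """True if code at code_version can ever meet state at state_version.

    Rule 2: code never runs on state older than itself (state >= code).
    Two-version limit: state is at most one major version ahead (state <= code + 1).
    """
    return state_version - code_version in (0, 1)


N = 2
checks = {
    "code N-1 on state N-1": may_run(N - 1, N - 1),
    "code N-1 on state N": may_run(N - 1, N),
    "code N on state N": may_run(N, N),
    "code N on state N-1": may_run(N, N - 1),
    "code N-1 on state N+1": may_run(N - 1, N + 1),
}
for name, allowed in checks.items():
    print(f"{name}: {'possible' if allowed else 'never happens'}")
```

Only the first three combinations can occur, which is why a setup script only has to account for the immediately previous version.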

Versions and Finalization

There is some nuance regarding executing code. Upgrades can happen while users are executing requests; this means that a stored procedure could be running on version N-1, while the application has been successfully upgraded to N. In order to enforce rule #1 (code at N-1 need only be compatible with state at N), we also ensure that version N-1 may not be dropped until all code has finished execution on it, even though the application itself is actually on version N. We call this process finalization.

In this diagram, Alice and Brenda are both using version 1.x when an upgrade to 2.x comes along. Alice was calling a long-running stored procedure (SP1) which was still executing after the upgrade completed, so she stays “pinned” to that version for the duration of her query. Meanwhile, the next call that Brenda makes will use version 2.0. Version 1.x may not be dropped until Alice finishes her query.

Finalization acts as a safety barrier. When version 3.x comes along, its setup script is guaranteed that there could be no running code from 1.x.
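The Alice and Brenda scenario can be sketched as a simple count of in-flight calls per version. This is a toy model in Python under the assumption that a call is pinned to whichever version is current when it starts; the Finalizer class and its methods are hypothetical and do not reflect how Snowflake implements finalization.

```python
class Finalizer:
    """Toy model of pinning and finalization (hypothetical names)."""

    def __init__(self, current):
        self.current = current   # version that new calls bind to
        self.running = {}        # version -> number of in-flight calls

    def start_call(self):
        # A call stays pinned to the version that was current when it started.
        version = self.current
        self.running[version] = self.running.get(version, 0) + 1
        return version

    def finish_call(self, version):
        self.running[version] -= 1

    def upgrade(self, version):
        self.current = version

    def is_finalized(self, version):
        # An old version is finalized once nothing is running on it any more.
        return version != self.current and self.running.get(version, 0) == 0


app = Finalizer("1.x")
alice = app.start_call()             # Alice's long-running SP1 starts on 1.x
brenda = app.start_call()
app.finish_call(brenda)              # Brenda's short call completes

app.upgrade("2.x")                   # the upgrade lands while SP1 is running
brenda = app.start_call()            # Brenda's next call binds to 2.x
before = app.is_finalized("1.x")     # Alice is still pinned to 1.x

app.finish_call(alice)               # SP1 finally completes
after = app.is_finalized("1.x")
print(alice, brenda, before, after)
```

Until Alice's call finishes, 1.x is not finalized and so cannot be dropped, even though the application itself is already on 2.x.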

Patches

A challenge with finalization is that queries in Snowflake may run for days, holding up the deployment of a new version for a significant amount of time. But what if we need to fix a critical issue immediately and don’t have time to wait for finalization?

In Native Apps, finalization only happens on major version upgrades (version N.x to version N+1.x); patches on the current version do not wait for finalization. This allows patches to roll out without waiting, but also means that state should (generally) not be modified by a patch. For example, consider this scenario:

The application has been patched several times, from v1.0 to v1.4. A number of users, however, had been executing long-running queries at the time these patches were deployed, so they are still executing code at those patch levels. Were a patch to introduce a state change, that patch must be compatible with the code from all other patches!

And, it gets worse when the next version comes along:

The state change must also be compatible with all patches of the next version.

The only time one should consider a state change in a patch is if the initial version (patch 0) had a bug that failed to upgrade the state from the previous version. Other than that, don’t do it. Just don’t.
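To get a feel for how quickly this grows, the sketch below lists every code level that a state-changing patch would have to stay compatible with. It is a counting aid only: the code_levels_to_check function and its parameters are hypothetical, and it assumes integer major versions with patches numbered from 0.

```python
def code_levels_to_check(version, patch, latest_patch, next_version_latest_patch):
    """Every (version, patch) whose code may touch state changed by `patch`.

    Patches do not wait for finalization, so code from any other patch of the
    same version may still be running, and every patch of the next version
    will inherit the changed state.
    """
    same_version = [(version, p) for p in range(latest_patch + 1) if p != patch]
    next_version = [(version + 1, p) for p in range(next_version_latest_patch + 1)]
    return same_version + next_version


# A state change slipped into v1.3, with v1 patched up to v1.4 and v2 up to v2.2.
levels = code_levels_to_check(version=1, patch=3, latest_patch=4,
                              next_version_latest_patch=2)
print(len(levels), levels)
```

In this example a single state change in one patch has to be reasoned about against seven other code levels, compared with one previous version for a state change made in a major version.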

Features vs. State

Note that just because you shouldn’t (mustn’t!) add new state in a patch doesn’t mean that you cannot add new features. The two are frequently coupled together, but you can certainly introduce new features in a patch provided they do not introduce state changes (tables, columns, etc.) and do not manipulate data in a way that would be incompatible with code from other running versions and patches.

(No) Rolling Upgrades

The contract above could also be achieved without the two-version restriction if we allowed for rolling upgrades. Rolling upgrades allow different instances to be at different versions at any given time (you can have stragglers) and, should an instance be asked to go from version N to, say, N+4, we would automatically roll the upgrade forward through N+1, N+2, N+3, and finally N+4.

Rolling upgrades have the advantage that you could, say, pin an account at a given version, and that a handful of instances failing upgrade do not hold up adding new features (versions) for the rest of the accounts — the failed instances can be patched later to eventually catch them up with the rest of the pack.

On the surface this looks great, but rolling upgrades can lead to unexpected brittleness. Snowflake itself is evolving underneath all of this, and those old versions may become impacted by behavior changes. Once this occurs, it may require patching not just the version that the account is sitting on, but also multiple intermediate versions that are needed to roll the account up to the latest release! The further an instance falls behind the release chain, the more delicate the surgery needed to move it forward.
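A rough Python sketch of that brittleness follows, assuming integer major versions and a single platform behavior change that affects every version released before some cutoff. The function names and the cutoff parameter are hypothetical.

```python
def upgrade_path(current, target):
    """Versions whose setup scripts run, in order, to roll from current to target."""
    return list(range(current + 1, target + 1))


def versions_needing_patch(current, target, first_unaffected):
    """The version the account sits on, plus every version on the path, that
    was released before the behavior change (older than first_unaffected)."""
    candidates = [current] + upgrade_path(current, target)
    return [v for v in candidates if v < first_unaffected]


# A straggler sits on version 3, the latest release is 7, and a behavior
# change affects every version older than 6.
path = upgrade_path(3, 7)
to_patch = versions_needing_patch(3, 7, first_unaffected=6)
print(path, to_patch)
```

Here one straggler forces patches to three old versions before it can reach the latest release, and that number grows with every release the instance misses.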

Upgrades: The Snowflake Way

Although it may not be readily apparent, this approach to upgrades is modeled on Snowflake itself:

  • In any given region there is a current version (N) and, during an upgrade, a version that is being rolled out (N+1)
  • A region must be fully upgraded to N+1 (so no more N is running) before a new version may begin rolling out
  • A regression during a rollout results in either a rollback, or a patch (perhaps reverting or fixing a feature)
  • Customers have no ability to prevent Snowflake from upgrading. They may opt out of new features and behavior changes (for a time), however the underlying code always marches forward.

This approach to versioning and upgrades is one of the ways in which Snowflake is able to maintain such a stable platform.

The Future of Versioning and Upgrades

I can’t really talk about the product roadmap here. However, it is not uncommon for application developers as well as users to request more control over how and when versions are rolled out, and we will continue to evolve the platform with more flexibility in mind. Any approach that is taken, though, needs to continue to strike a balance between expressive power in versioning and upgrades and the ability to easily shoot yourself in the foot in unexpected and exciting ways.

--

Scott Gray
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Scott is a Principal Software Engineer at Snowflake, leading the Native Apps Foundation Team.