How Should We Test JVM Applications in Snowflake?

Introducing Snowflake Test Suites using Gradle

Too many Snowflake applications are deployed without adequate testing.

If you’ve read my articles over the last six months, you probably noticed I’m obsessed with building JVM applications for Snowflake: OK… I recognize that. The Snowpark team noticed as well, and I’m proud to say that my Gradle Snowflake plugin is the inspiration for Snowflake building one of their own. Thanks to Jason Freeberg for a shoutout in the README, for consulting with me before building it, and for accepting my feedback along the way. I’ll continue to update and enhance this plugin as long as teams are using it, but I’ve been invited by the Snowpark team to contribute to theirs as well. When parity exists and when theirs is available on the Gradle plugin portal, I may shift to exclusively contributing to theirs, but I’ll keep you posted.

Background Reading

A Quick Refresher

The Gradle Snowflake plugin extends the functionality of Gradle, a build tool for building JVM applications for languages like Java, Scala, Kotlin, and Groovy. Specifically, Gradle builds Java-compatible JAR libraries from any of these languages, and my plugin extends that functionality by packing all our dependencies in that JAR library, uploading the JAR to a stage in Snowflake, and creating the function or procedure defined in the plugin DSL. There are numerous language examples in the repository, but we’ll take a look at the Scala sample and see how we can deploy a function:

gh repo clone stewartbryson/gradle-snowflake &&
cd gradle-snowflake/examples/scala &&
./gradlew tasks

Our build.gradle looks like this:

plugins {
    id 'scala'
    id 'com.github.ben-manes.versions' version '0.46.0'
    id 'io.github.stewartbryson.snowflake' version '2.0.12'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.scala-lang:scala-library:2.13.10'
}

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(11)
    }
}

version = '0.1.0'

snowflake {
    connection = 'gradle_plugin'
    stage = 'upload'
    applications {
        add_numbers {
            inputs = ["a integer", "b integer"]
            returns = "string"
            handler = "Sample.addNum"
        }
    }
}

Everything in this file is standard for building a Scala application until we get to the snowflake{} closure: this DSL is added by my Snowflake plugin. Here we define the SnowSQL connection we want to use, the stage we want to upload to, and an application called add_numbers that includes all the specifics about how to create that application, in this case, a function. We call the snowflakeJvm task, which builds, packs, and uploads the JAR and creates the function:

=❯ ./gradlew snowflakeJvm

> Task :snowflakeJvm
Using snowsql config file: /Users/stewartbryson/.snowsql/config
File scala-0.1.0-all.jar: UPLOADED
Deploying ==>
CREATE OR REPLACE function add_numbers (a integer, b integer)
returns string
language JAVA
handler = 'Sample.addNum'
imports = ('@upload/libs/scala-0.1.0-all.jar')

BUILD SUCCESSFUL in 6s
7 actionable tasks: 1 executed, 2 from cache, 4 up-to-date
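The statement in the output above lines up one-to-one with the properties in the snowflake{} closure. As a rough illustration of that mapping, and not the plugin's actual code, a few lines of Java can assemble the same text. The FunctionDdl class and its create method are hypothetical names I made up for this sketch:

```java
import java.util.List;

// Hypothetical helper, NOT part of the plugin: builds the DDL text
// from the same values that appear in the snowflake{} closure.
class FunctionDdl {
    static String create(String name, List<String> inputs, String returns,
                         String handler, String importPath) {
        return String.join("\n",
                "CREATE OR REPLACE function " + name + " (" + String.join(", ", inputs) + ")",
                "returns " + returns,
                "language JAVA",
                "handler = '" + handler + "'",
                "imports = ('" + importPath + "')");
    }
}
```

Calling FunctionDdl.create("add_numbers", List.of("a integer", "b integer"), "string", "Sample.addNum", "@upload/libs/scala-0.1.0-all.jar") yields the five lines shown after "Deploying ==>" in the log.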

The API documentation can assist in understanding the build DSL. This task was developed to be incremental and cacheable, so execution is avoided if we run it again without making any changes to our source. This is clear from the up-to-date message and can be overridden with the --rerun-tasks option:

❯ ./gradlew snowflakeJvm

BUILD SUCCESSFUL in 550ms
7 actionable tasks: 7 up-to-date

One topic I’ve neglected in the background reading is a meaningful discussion on testing: what is it, what are the different types, and how does this change when the deployment target is Snowflake? The need for basic unit testing is what inspired me to write this plugin in the first place, so I’m excited to finally address it. For build tools like Gradle, unit testing is table stakes: it’s a core tenet of the overall build chain. So let’s talk about testing, specifically the difference between unit testing and functional testing, and how we can implement both forms using Gradle and the Snowflake plugin.

Unit Tests Don’t Change when our Target is Snowflake

Unit testing with Gradle is a dense topic, and the documentation will inform developers better than I can. In general, unit tests are what developers write regardless of where the code eventually gets executed. This makes sense, right? We could write a Scala library with core business logic that we want to use in Snowflake, and at the same time, we want that library available to developers building a Java application that runs in a container. In both scenarios, our unit tests would be identical.

In the examples directory in the Gradle Snowflake repository, I’ve added a sample that demonstrates both unit and functional testing for Java. Everything in that sample would also work for other JVM languages with a few tweaks.

cd ../java-testing
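For orientation, here is roughly what the class under test could look like. This is a minimal sketch inferred from the handler name Sample.addNum and the "Sum is: 3" result used in the tests; the real Sample.java in the repository may differ in its details:

```java
// Sketch of the class under test; an assumption based on the handler
// name and expected results, not a copy of the repository's file.
class Sample {
    // Plain instance method: nothing here depends on Snowflake,
    // which is why the unit tests can run anywhere a JVM runs.
    public String addNum(int a, int b) {
        return "Sum is: " + (a + b);
    }
}
```

Because the method is ordinary Java, the same class can back a Snowflake function and be called directly from any other JVM application.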

I’ve included a sample testing specification (spec) written using the Spock Framework, which is a clean, modern testing framework that is popular with JVM development teams. Spock specs are written using Groovy, so our example SampleTest.groovy is in the src/test/groovy directory:

import spock.lang.Specification
import spock.lang.Subject

class SampleTest extends Specification {
    @Subject
    def sample = new Sample()

    def "adding 1 and 2"() {
        when: "Two numbers"
        def a = 1
        def b = 2

        then: "Add numbers"
        sample.addNum(a, b) == "Sum is: 3"
    }

    def "adding 3 and 4"() {
        when: "Two numbers"
        def a = 3
        def b = 4

        then: "Add numbers"
        sample.addNum(a, b) == "Sum is: 7"
    }
}

This spec will now run anytime we execute either the build or test task:

=❯ ./gradlew build

> Task :test

SampleTest

Test adding 1 and 2 PASSED
Test adding 3 and 4 PASSED

SUCCESS: Executed 2 tests in 475ms

BUILD SUCCESSFUL in 3s
9 actionable tasks: 9 executed

All Gradle testing tasks are automatically incremental and cacheable and would be avoided if executed again without code changes in either the source or the spec. This is also true for the functional test suites described below.

Functional Tests are Executed in Snowflake

Functional testing is sometimes called “black-box testing” because it tests what the software does without necessarily understanding how it does it. I would argue that the only way to test the functionality of Snowflake applications is to test the deployed code in Snowflake. One of the reasons I’ve avoided the testing subject in my articles so far is I didn’t have a good answer for how to do functional testing.

Beginning with release 2.0.0, this plugin now contains a custom Spock spec called SnowflakeSpec that can be used for building functional test suites for Snowflake. By default, the plugin is designed to handle a suite called functionalTest, though the name can be configured using the testSuite property in the snowflake{} DSL. In our build.gradle file, we can use Gradle’s built-in JvmTestSuite plugin to configure the functionalTest suite that the plugin expects. The DSL provided by Gradle in this plugin is convoluted (in my opinion), but no one asked me:

functionalTest(JvmTestSuite) {
    targets {
        all {
            useSpock('2.3-groovy-3.0')
            dependencies {
                // library not plugin coordinates
                implementation "io.github.stewartbryson:gradle-snowflake-plugin:2.0.13"
            }
            testTask.configure {
                failFast true
                // which SnowSQL connection to use
                systemProperty 'connection', snowflake.connection
            }
        }
    }
}

I’ll walk through a few points. So that SnowflakeSpec is available in the test classpath, we have to declare the plugin as a dependency of the test suite with the dependencies{} DSL. Notice that we use the library Maven coordinates, which are different from the coordinates in the plugins{} DSL. Additionally, our test specs are unaware of the configuration of our Gradle build, so we have to pass our connection property as a Java system property to the SnowflakeSpec class.
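The system property hand-off works like any other JVM property: Gradle sets it on the forked test JVM, and the test code reads it back with System.getProperty. A minimal sketch of the reading side, assuming a hypothetical SpecSettings class (this is not the actual SnowflakeSpec source):

```java
// Hypothetical holder for what a spec needs from the build;
// NOT the plugin's real class, just an illustration of the hand-off.
final class SpecSettings {
    final String connection;

    private SpecSettings(String connection) {
        this.connection = connection;
    }

    // Reads the value the Gradle test task forwarded with
    // systemProperty 'connection', snowflake.connection
    static SpecSettings fromSystemProperties() {
        String connection = System.getProperty("connection");
        if (connection == null) {
            // Fail early with a clear message instead of a confusing connection error later.
            throw new IllegalStateException("System property 'connection' was not set by the build");
        }
        return new SpecSettings(connection);
    }
}
```

Failing fast on a missing property is a design choice for the sketch; it keeps a misconfigured suite from looking like a Snowflake outage.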

This is the SnowflakeSampleTest.groovy spec in src/functionalTest/groovy:

import groovy.util.logging.Slf4j
import io.github.stewartbryson.SnowflakeSpec

/**
 * The SnowflakeSpec used for testing functions.
 */
@Slf4j
class SnowflakeSampleTest extends SnowflakeSpec {

    def 'ADD_NUMBERS() function with 1 and 2'() {
        when: "Two numbers exist"
        def a = 1
        def b = 2

        then: 'Add two numbers using ADD_NUMBERS()'
        selectFunction("add_numbers", [a, b]) == 'Sum is: 3'
    }

    def 'ADD_NUMBERS() function with 3 and 4'() {
        when: "Two numbers exist"
        def a = 3
        def b = 4

        then: 'Add two numbers using ADD_NUMBERS()'
        selectFunction("add_numbers", [a, b]) == 'Sum is: 7'
    }

}

The selectFunction() method in SnowflakeSpec is an easy way to execute a function and test the results by accepting the function name and any required arguments. And of course, this executes against Snowflake in real-time:
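Conceptually, a helper like selectFunction() has to turn a function name and an argument list into a SELECT statement before sending it to Snowflake. The sketch below shows only that string-building step, under my own assumptions: the SelectStatement class is a hypothetical name, and the actual SnowflakeSpec implementation may build and execute the query quite differently:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the query a helper like selectFunction()
// could issue; NOT taken from the SnowflakeSpec source.
class SelectStatement {
    static String forFunction(String function, List<?> args) {
        String argList = args.stream()
                .map(SelectStatement::literal)
                .collect(Collectors.joining(","));
        return "select " + function + "(" + argList + ");";
    }

    // Numbers are passed through as-is; everything else becomes a
    // single-quoted SQL string with embedded quotes doubled.
    private static String literal(Object value) {
        if (value instanceof Number) {
            return value.toString();
        }
        return "'" + value.toString().replace("'", "''") + "'";
    }
}
```

With the arguments from the spec above, SelectStatement.forFunction("add_numbers", List.of(1, 2)) produces select add_numbers(1,2); and the test then compares the single returned value with the expected string.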

=❯ ./gradlew functionalTest

> Task :test

SampleTest

Test adding 1 and 2 PASSED
Test adding 3 and 4 PASSED

SUCCESS: Executed 2 tests in 507ms

> Task :snowflakeJvm
Using snowsql config file: /Users/stewartbryson/.snowsql/config
File java-testing-0.1.0-all.jar: UPLOADED
Deploying ==>
CREATE OR REPLACE function add_numbers (a integer, b integer)
returns string
language JAVA
handler = 'Sample.addNum'
imports = ('@upload/libs/java-testing-0.1.0-all.jar')

> Task :functionalTest

SnowflakeSampleTest

Test ADD_NUMBERS() function with 1 and 2 PASSED (1.2s)
Test ADD_NUMBERS() function with 3 and 4 PASSED (1.2s)

SUCCESS: Executed 2 tests in 4.5s

BUILD SUCCESSFUL in 14s
11 actionable tasks: 11 executed

We can see in Snowsight that our tests were executed.

Bringing it All Together with Ephemeral Testing

Running functional tests using static Snowflake databases is boring, especially considering the zero-copy cloning functionality available. The plugin supports cloning an ephemeral database from the database we connect to and using it for testing our application. This workflow is useful for CI/CD processes and is configured with the snowflake{} DSL. The plugin detects when it is running in a CI/CD environment via the CI Detect Gradle plugin.

We expose the CI/CD information through the plugin and can use it to control our cloning behavior:

snowflake {
    connection = 'gradle_plugin'
    stage = 'upload'
    useEphemeral = snowflake.isCI() // use ephemeral with CICD workflows
    keepEphemeral = snowflake.isPR() // keep ephemeral for PRs
    applications {
        add_numbers {
            inputs = ["a integer", "b integer"]
            returns = "string"
            handler = "Sample.addNum"
        }
    }
}

The useEphemeral property will determine whether the createEphemeral and dropEphemeral tasks are added at the beginning and end of the build, respectively. This allows for the functionalTest task to execute in the ephemeral database just after our application is published. We've also added a little extra magic to keep the clone when building a pull request. The createEphemeral task issues a CREATE DATABASE... IF NOT EXISTS statement, so it will not fail if the clone exists from a prior run. Remember that our SnowflakeSpec class doesn't automatically know the details of our build, so we have to provide the ephemeral name using Java system properties. Here is our modified testing suite:

functionalTest(JvmTestSuite) {
    targets {
        all {
            useSpock('2.3-groovy-3.0')
            dependencies {
                // SnowflakeSpec requires release 2.0.0 or later of the library
                implementation "io.github.stewartbryson:gradle-snowflake-plugin:2.0.13"
            }
            testTask.configure {
                failFast true
                // which SnowSQL connection to use
                systemProperty 'connection', snowflake.connection
                // if this is ephemeral, the test spec needs the name to connect to
                if (snowflake.useEphemeral) {
                    systemProperty 'ephemeralName', snowflake.ephemeralName
                }
            }
        }
    }
}

Below is the output from a GitHub Action I use to test all the examples during a pull request:

=❯ ./gradlew functionalTest

> Task :createEphemeral
Using snowsql config file: /home/runner/.snowsql/config
Ephemeral clone EPHEMERAL_JAVA_TESTING_PR_94 created if not exists.

> Task :test

SampleTest

Test adding 1 and 2 PASSED
Test adding 3 and 4 PASSED

SUCCESS: Executed 2 tests in 1.6s

> Task :snowflakeJvm
Reusing existing connection.
File java-testing-0.1.0-all.jar: UPLOADED
Deploying ==>
CREATE OR REPLACE function add_numbers (a integer, b integer)
returns string
language JAVA
handler = 'Sample.addNum'
imports = ('@upload/libs/java-testing-0.1.0-all.jar')

> Task :functionalTest

SnowflakeSampleTest

Test ADD_NUMBERS() function with 1 and 2 PASSED (1.2s)
Test ADD_NUMBERS() function with 3 and 4 PASSED

SUCCESS: Executed 2 tests in 7s

BUILD SUCCESSFUL in 2m 19s
12 actionable tasks: 12 executed
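The createEphemeral and dropEphemeral tasks in this workflow boil down to a pair of SQL statements built around Snowflake's zero-copy CLONE clause. Here is a sketch of what those statements could look like. The EphemeralSql class is a hypothetical name, the source database name is a placeholder, and the exact text the plugin issues (for example, whether the drop uses IF EXISTS) is an assumption on my part:

```java
// Hypothetical sketch of the statements behind createEphemeral and
// dropEphemeral; the plugin's real SQL may differ.
class EphemeralSql {
    // IF NOT EXISTS makes the statement safe to rerun when the clone
    // was kept from an earlier build of the same pull request.
    static String create(String cloneName, String sourceDatabase) {
        return "CREATE DATABASE IF NOT EXISTS " + cloneName + " CLONE " + sourceDatabase;
    }

    // Assumption: IF EXISTS is used so cleanup does not fail on a missing clone.
    static String drop(String cloneName) {
        return "DROP DATABASE IF EXISTS " + cloneName;
    }
}
```

For the run above, EphemeralSql.create("EPHEMERAL_JAVA_TESTING_PR_94", "SOURCE_DB") would give CREATE DATABASE IF NOT EXISTS EPHEMERAL_JAVA_TESTING_PR_94 CLONE SOURCE_DB, where SOURCE_DB stands in for whatever database the connection points at.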

When the CI/CD environment is detected, the plugin will name the ephemeral database clone based on the pull request number, the branch name, or the tag name instead of an autogenerated one. If we prefer to simply specify a clone name instead of relying on the plugin to generate it, that is supported as well:

useEphemeral = true
keepEphemeral = false
ephemeralName = "testing_db"

=❯ ./gradlew functionalTest

> Task :createEphemeral
Using snowsql config file: /Users/stewartbryson/.snowsql/config
Ephemeral clone testing_db created if not exists.

> Task :test

SampleTest

Test adding 1 and 2 PASSED
Test adding 3 and 4 PASSED

SUCCESS: Executed 2 tests in 460ms

> Task :snowflakeJvm
Reusing existing connection.
File java-testing-0.1.0-all.jar: UPLOADED
Deploying ==>
CREATE OR REPLACE function add_numbers (a integer, b integer)
returns string
language JAVA
handler = 'Sample.addNum'
imports = ('@upload/libs/java-testing-0.1.0-all.jar')

> Task :functionalTest

SnowflakeSampleTest

Test ADD_NUMBERS() function with 1 and 2 PASSED (1.3s)
Test ADD_NUMBERS() function with 3 and 4 PASSED (1.3s)

SUCCESS: Executed 2 tests in 5.3s

> Task :dropEphemeral
Reusing existing connection.
Ephemeral clone testing_db dropped.

BUILD SUCCESSFUL in 15s
13 actionable tasks: 13 executed
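The two logs show both naming paths: a generated name for the pull request build (EPHEMERAL_JAVA_TESTING_PR_94) and the explicit testing_db. The sketch below reproduces that choice in Java. The pattern is inferred only from the one name in the log, the EphemeralNames class is hypothetical, and I have not tried to guess the branch or tag formats:

```java
import java.util.Locale;

// Hypothetical naming logic inferred from the log output above;
// NOT the plugin's implementation.
class EphemeralNames {
    // An explicitly configured name wins; otherwise derive one from the PR.
    static String resolve(String explicitName, String projectName, int prNumber) {
        return explicitName != null ? explicitName : forPullRequest(projectName, prNumber);
    }

    static String forPullRequest(String projectName, int prNumber) {
        return normalize("ephemeral_" + projectName + "_pr_" + prNumber);
    }

    // Replace characters that are awkward in unquoted identifiers and uppercase the result.
    static String normalize(String raw) {
        return raw.replaceAll("[^A-Za-z0-9_]", "_").toUpperCase(Locale.ROOT);
    }
}
```

Deriving the name from the pull request number is what lets a kept clone be found again on the next push to the same PR.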

Contributing

If you are interested in opening a feature request or bug report, please do so. I’d love to hear what you think of the functional testing and any missing features you think it should have. The project README has a section on contributing guidelines: how to run the unit, functional and integration tests, and which branch to contribute against.

Stewart Bryson
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Snowflake Data Superhero | Oracle ACE Alum | Writer, speaker, podcast guest | Amateur cyclist | Professional philosopher