Snowpark for Java and Scala: now open-source on GitHub

The repository for the Snowpark API for Java and Scala is now open source on GitHub! You can view the code at SnowflakeDB/snowpark-java-scala.

Want to get oriented with the code base? Read on!


Code Overview

The repository is structured as a typical Scala project. The main directory is split into separate directories for the Java and Scala APIs, and the vast majority of the API is implemented in Scala.

In the Java sources there are separate directories for the Scala import (under “snowpark”) and for the Java implementation (under “snowpark_java”).

At build time, the Scala API is packaged and included in the snowpark/ directory in the Java sources and is referenced throughout the Java implementation. For example, here is an abridged version of DataFrame.java:

package com.snowflake.snowpark_java; // This is the Java implementation (hence, "_java")

public class DataFrame extends Logging implements Cloneable {
    private final com.snowflake.snowpark.DataFrame df; // Scala DF is referenced (no "_java" in package name)

    DataFrame(com.snowflake.snowpark.DataFrame df) {
        this.df = df;
    }

    // Rest of DataFrame implementation...
}

The snowpark_java.DataFrame class references snowpark.DataFrame from the Scala project and uses it throughout the implementation. If you poke around the repo, you’ll see the same pattern repeated for many of the Java classes.
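To make the pattern concrete, here is a minimal, self-contained sketch of the same wrap-and-delegate idea. It is not code from the repo: CoreFrame and JavaFrame are hypothetical stand-ins for snowpark.DataFrame and snowpark_java.DataFrame, the methods are chosen only for illustration, and the "Scala side" is written in plain Java so the example compiles on its own.

```java
import java.util.List;

// Hypothetical stand-in for the Scala-side class (the role played by
// com.snowflake.snowpark.DataFrame). Written in Java only to keep the sketch standalone.
final class CoreFrame {
    private final List<String> rows;

    CoreFrame(List<String> rows) {
        this.rows = List.copyOf(rows);
    }

    // Returns a new frame holding at most n rows.
    CoreFrame limit(int n) {
        int end = Math.min(Math.max(n, 0), rows.size());
        return new CoreFrame(rows.subList(0, end));
    }

    long count() {
        return rows.size();
    }
}

// Hypothetical stand-in for the Java-facing wrapper (the role played by
// com.snowflake.snowpark_java.DataFrame).
final class JavaFrame {
    private final CoreFrame df; // the wrapped core object does the real work

    JavaFrame(CoreFrame df) {
        this.df = df;
    }

    // Delegate to the core object, then re-wrap the result so callers
    // only ever see the Java-facing type.
    JavaFrame limit(int n) {
        return new JavaFrame(df.limit(n));
    }

    // Plain values pass straight through.
    long count() {
        return df.count();
    }
}

public class WrapperSketch {
    public static void main(String[] args) {
        JavaFrame frame = new JavaFrame(new CoreFrame(List.of("a", "b", "c")));
        System.out.println(frame.limit(2).count()); // prints 2
    }
}
```

The appeal of this design is that the logic lives in one place: the wrapper only translates between the two APIs, so a fix on the core side reaches Java callers without duplicating code.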
