Snowpark for Java and Scala: now open-source on GitHub
The repository for the Java and Scala Snowpark API is now open-source on GitHub! You can view the code at SnowflakeDB/snowpark-java-scala.
- See the list of current issues, and comment on or upvote the ones that are relevant to you
- Check out the contributing guide for guidance on making pull requests
- Open an issue if you have encountered a bug or want to recommend an improvement
Want to get oriented with the code base? Read on below!
Code Overview
The repository is structured as a typical Maven-based Scala project:
- src/main/java: Java API sources root
- src/main/scala: Scala API sources root
- src/test: project test cases
- pom.xml: build file and dependencies
As the layout above shows, src/main is split into separate directories for the Java and Scala APIs. The vast majority of the API is implemented in Scala.
At build time, the Scala API is packaged and included in the snowpark/ directory in the Java sources, and it is referenced throughout the Java implementation. For example, here is an abridged version of DataFrame.java:
package com.snowflake.snowpark_java; // This is the Java implementation (hence, "_java")

public class DataFrame extends Logging implements Cloneable {
    private final com.snowflake.snowpark.DataFrame df; // Scala DF is referenced (no "_java" in package name)

    DataFrame(com.snowflake.snowpark.DataFrame df) {
        this.df = df;
    }

    // Rest of DataFrame implementation...
}
The snowpark_java.DataFrame class references snowpark.DataFrame from the Scala project and uses it throughout the implementation. If you poke around the repo, you’ll see the same pattern repeated for many of the Java classes.
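To make the wrapper pattern concrete, here is a minimal, self-contained sketch of how a Java-facing class can hold a core object and delegate to it. This is not code from the repository: CoreFrame and JavaFrame are hypothetical stand-ins for snowpark.DataFrame and snowpark_java.DataFrame, and the limit and count methods are simplified illustrations chosen only to show the delegation.

```java
import java.util.List;

// Hypothetical stand-in for a class from the core (Scala) API; not real Snowpark code.
final class CoreFrame {
    private final List<String> rows;

    CoreFrame(List<String> rows) {
        this.rows = List.copyOf(rows); // immutable copy of the input rows
    }

    // Returns a new frame holding at most n rows (none if n is negative).
    CoreFrame limit(int n) {
        int end = Math.max(0, Math.min(n, rows.size()));
        return new CoreFrame(rows.subList(0, end));
    }

    long count() {
        return rows.size();
    }
}

// Hypothetical Java-facing wrapper: keeps a reference to the core object
// and forwards every call to it.
final class JavaFrame {
    private final CoreFrame core;

    JavaFrame(CoreFrame core) {
        this.core = core;
    }

    // Delegate to the core object, then re-wrap the result so callers
    // only ever see the Java-facing type.
    JavaFrame limit(int n) {
        return new JavaFrame(core.limit(n));
    }

    // Simple values are passed straight through.
    long count() {
        return core.count();
    }
}
```

With this shape, the wrapper stays thin: the logic lives in one place (the core class), and the Java-facing class only translates between the two type families. Calling limit on a JavaFrame returns another JavaFrame, so the core type never leaks into calling code.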