New in Snowflake: Java UDFs (with a Kotlin NLP example)

Snowflake now lets you easily create Java UDFs, which is an incredibly powerful and versatile feature. Let’s check it out by running a library written in Kotlin to detect written languages. Out of GitHub and into your SQL code, in 3 easy steps.

Felipe Hoffa
Jun 16 · 3 min read
Detect written languages with a Java UDF

Quick example: Detect written languages

Snowflake has made it really easy to create Java UDFs. You just need to do something like this:

create function add(x integer, y integer)
returns integer
language java
handler='Test.add'
as
$$
class Test {
    public static int add(int x, int y) {
        return x + y;
    }
}
$$;

select add(1, 3); -- 4
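
The method named in handler='Test.add' is an ordinary static Java method, so it can also be compiled and exercised outside Snowflake. A minimal sketch, where AddDemo is a hypothetical wrapper class added only to run it locally:

```java
// Same handler class as in the UDF body; nothing Snowflake-specific is needed.
class Test {
    public static int add(int x, int y) {
        return x + y;
    }
}

// Hypothetical local runner, not part of the UDF.
class AddDemo {
    public static void main(String[] args) {
        System.out.println(Test.add(1, 3)); // prints 4
    }
}
```

Checking a handler this way before pasting it between the $$ markers can shorten the feedback loop for anything more involved than addition.
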
You can go further and call a Java library, for example one that detects written languages. Once the jar is in a Snowflake stage, the function looks like this:

create or replace function detect_lang(x string)
returns string
language java
imports = ('@~/lingua-1.1.0-with-dependencies.jar')
handler='MyClass.detect'
as
$$
import com.github.pemistahl.lingua.api.*;
import static com.github.pemistahl.lingua.api.Language.*;

class MyClass {
    static LanguageDetector detector = LanguageDetectorBuilder.fromLanguages(ENGLISH, FRENCH, GERMAN, SPANISH).build();

    public static String detect(String x) {
        return detector.detectLanguageOf(x).toString();
    }
}
$$;

select $1, detect_lang($1)
from values('languages are awesome'),('hola mi amigo'),('hallo Freunde'),('je ne parle pas');
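
One detail worth noticing in the handler: the detector lives in a static field, so it is built when the class loads, not on every call. The sketch below shows that pattern in plain Java. ExpensiveResource, StaticInitHandler and StaticInitDemo are hypothetical stand-ins (not part of Lingua or Snowflake), and the sketch says nothing about how often Snowflake itself loads the handler class.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for something costly to build, such as a language detector.
class ExpensiveResource {
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    ExpensiveResource() {
        BUILD_COUNT.incrementAndGet(); // record each construction
    }

    String process(String input) {
        return input.toUpperCase(Locale.ROOT);
    }
}

// Handler-style class: the static field is initialized once, at class load.
class StaticInitHandler {
    static final ExpensiveResource RESOURCE = new ExpensiveResource();

    public static String handle(String x) {
        return RESOURCE.process(x);
    }
}

// Hypothetical local runner.
class StaticInitDemo {
    public static void main(String[] args) {
        for (String s : new String[] {"a", "b", "c"}) {
            System.out.println(StaticInitHandler.handle(s));
        }
        // Three calls, one construction.
        System.out.println("builds: " + ExpensiveResource.BUILD_COUNT.get());
    }
}
```

Building the resource inside the method body instead would repeat the setup cost for every input value.
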

Notes

  • I love the ability to write custom Java code while defining UDFs within SQL (Snowflake takes care of compiling it). This allows you to handle all your glue code in one place, within your SQL scripts and dbt projects.
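
As an illustration of the kind of glue code meant here, below is a hypothetical handler class (TextGlue and normalize are made-up names, not from Snowflake or Lingua) that cleans up text before it is passed to something like detect_lang. It assumes a SQL NULL reaches the handler as a Java null, which is worth checking against the Snowflake documentation.

```java
import java.util.Locale;

// Hypothetical glue-code handler: tidy free text before further processing.
class TextGlue {
    public static String normalize(String x) {
        if (x == null) {
            return null; // assumption: SQL NULL arrives as Java null
        }
        // Trim the ends, collapse runs of whitespace, then lower-case.
        return x.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }
}

// Hypothetical local runner.
class TextGlueDemo {
    public static void main(String[] args) {
        System.out.println(TextGlue.normalize("  Hola   MI amigo ")); // prints hola mi amigo
    }
}
```

Placed between the $$ markers of a create function statement with handler='TextGlue.normalize', a class like TextGlue should be usable as an inline UDF body in the same way as the examples in this post.
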

Setup

Lingua is “The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike” (according to its authors). It’s written in Kotlin, a language that runs on the JVM.

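  1. Build a jar with dependencies with Gradle.
git clone https://github.com/pemistahl/lingua.git
cd lingua
./gradlew jarWithDependencies
  2. Upload the jar to your Snowflake user stage (@~), for example from SnowSQL.
put 'file://build/libs/lingua-1.1.0-with-dependencies.jar' @~;
  3. Create the function in Snowflake, as shown in the quick example.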

Acknowledgements

Java UDF support is in active development by a great team at Snowflake, including Elliott Brossard and Isaac Kunen. Stay tuned for more!

Snowflake

Articles for engineers, by engineers.

Snowflake

Snowflake articles from engineers using Snowflake to power their data.

Written by Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.
