Using a LATERAL
join and the SPLIT_TO_TABLE
function.
Do you have a column in your Snowflake table that contains delimited strings that you want to split into individual rows?
For example, let’s say we have a table called countries_official_languages
, which contains two columns: country
and official_languages
.
The data might look something like this:
However we want each language to be on its own row:
You can do this in Snowlake by taking advantage of the SPLIT_TO_TABLE
function and LATERAL
keyword.
SPLIT_TO_TABLE
Snowflake offers the handy SPLIT_TO_TABLE
function, which “splits a string (based on a specified delimiter) and flattens the results into rows.”
Here’s an example of it in use:
SELECT table1.value
FROM table(SPLIT_TO_TABLE('Arabic, Berber', ',')) AS table1
Output:
+-------+
| VALUE |
|-------|
| Arabic|
| Berber|
+-------+
By specifying a comma as the delimiter, Snowflake has split the string into rows. Great!
However, as the Snowflake documentation points out, “table functions are used in the FROM clause of a SQL statement”.
We need to execute the SPLIT_TO_TABLE
for each row in our country_languages
table. Plus, we need the country
column included also so we know which languages belong to which countries.
How do we do this? Using a LATERAL
join.
LATERAL joins
A lateral join is different from a regular join like INNER JOIN OR LEFT JOIN
in that it “allows an inline view to reference columns from a table expression that precedes that inline view.”
That inline view can be a subquery, table function or an inline view (a view defined within the statement, and valid only for the duration of the statement).
Confused? What’s important to understand is what happens when you use LATERAL
:
for each row in left_hand_table LHT:
execute right_hand_subquery RHS using the values from
the current row in the LHT
As you might have guessed, we can take advantage of the LATERAL
keyword in order to execute the SPLIT_TO_TABLE
function for each row in our countries_official_languages
table:
SELECT country, TRIM(VALUE) AS language
FROM countries_official_languages, -- Don't forgot the comma!
LATERAL SPLIT_TO_TABLE(official_languages, ',')
Let’s examine this in more detail. Let’s take the first row in our table as an example of what happens when we execute the above query:
|country|official_languages|
|-------|------------------|
|Algeria|Arabic, Berber |
- This row is passed to the right-hand inline view, which is in this case is a table function —
SPLIT_TO_TABLE
. - The table function returns two rows, for Arabic and Berber, in a column called
VALUE
. - In our
SELECT
statement we select thecountry
column and theVALUE
column. I’ve wrapped theTRIM
function around theVALUE
column to remove any whitespace. - This final result set is appended to what will be the final output once every row from the table has been processed.
And here is our final output:
And that’s it!
I want to note that using the LATERAL
keyword with a subquery utilises different syntax — usually you specify the join condition within the WHERE
clause of the right-hand subquery but this is beyond the scope of this article.
If you want to learn more about using LATERAL
keyword with a subquery I highly recommend this concise, insightful article:
If you enjoyed this article then follow me on Medium and Twitter/X.