Splitting a delimited string column into rows using Snowflake

Ben Nour
3 min readJul 20, 2023

Using a LATERAL join and the SPLIT_TO_TABLE function.

Photo by Jonas Jacobsson on Unsplash

Do you have a column in your Snowflake table that contains delimited strings that you want to split into individual rows?

For example, let’s say we have a table called countries_official_languages, which contains two columns: country and official_languages.

The data might look something like this:

However we want each language to be on its own row:

You can do this in Snowlake by taking advantage of the SPLIT_TO_TABLE function and LATERAL keyword.

SPLIT_TO_TABLE

Snowflake offers the handy SPLIT_TO_TABLE function, which “splits a string (based on a specified delimiter) and flattens the results into rows.”

Here’s an example of it in use:

SELECT table1.value 
FROM table(SPLIT_TO_TABLE('Arabic, Berber', ',')) AS table1

Output:

+-------+
| VALUE |
|-------|
| Arabic|
| Berber|
+-------+

By specifying a comma as the delimiter, Snowflake has split the string into rows. Great!

However, as the Snowflake documentation points out, “table functions are used in the FROM clause of a SQL statement”.

We need to execute the SPLIT_TO_TABLE for each row in our country_languages table. Plus, we need the country column included also so we know which languages belong to which countries.

How do we do this? Using a LATERAL join.

LATERAL joins

A lateral join is different from a regular join like INNER JOIN OR LEFT JOIN in that it “allows an inline view to reference columns from a table expression that precedes that inline view.”

That inline view can be a subquery, table function or an inline view (a view defined within the statement, and valid only for the duration of the statement).

Confused? What’s important to understand is what happens when you use LATERAL:

for each row in left_hand_table LHT:
execute right_hand_subquery RHS using the values from
the current row in the LHT

As you might have guessed, we can take advantage of the LATERAL keyword in order to execute the SPLIT_TO_TABLE function for each row in our countries_official_languages table:

SELECT country, TRIM(VALUE) AS language
FROM countries_official_languages, -- Don't forgot the comma!
LATERAL SPLIT_TO_TABLE(official_languages, ',')

Let’s examine this in more detail. Let’s take the first row in our table as an example of what happens when we execute the above query:

|country|official_languages|
|-------|------------------|
|Algeria|Arabic, Berber |
  1. This row is passed to the right-hand inline view, which is in this case is a table function — SPLIT_TO_TABLE.
  2. The table function returns two rows, for Arabic and Berber, in a column called VALUE.
  3. In our SELECT statement we select the country column and the VALUE column. I’ve wrapped the TRIM function around the VALUE column to remove any whitespace.
  4. This final result set is appended to what will be the final output once every row from the table has been processed.

And here is our final output:

And that’s it!

I want to note that using the LATERAL keyword with a subquery utilises different syntax — usually you specify the join condition within the WHERE clause of the right-hand subquery but this is beyond the scope of this article.

If you want to learn more about using LATERAL keyword with a subquery I highly recommend this concise, insightful article:

If you enjoyed this article then follow me on Medium and Twitter/X.

--

--