Create a state dimension table in one SQL

Vivian
3 min readFeb 17, 2023

--

The Australia states are usually stored in data as abbreviations. Storing a dimension table is more handy than repeatedly writing the conditional expression to interpret the abbreviation to full state name.

Creating a dimension table can use the traditional method of creating a table to insert values.

create table public.dim_australia_state
(
state_name varchar(50),
state_abbrv varchar(3)
);

insert into dim_australia_state values ('New South Wales','NSW');
insert into dim_australia_state values ('Victoria','VIC');
insert into dim_australia_state values ('Australian Capital Territory','ACT');
insert into dim_australia_state values ('Queensland','QLD');
insert into dim_australia_state values ('South Australia','SA');
insert into dim_australia_state values ('Western Australia','WA');
insert into dim_australia_state values ('Tasmania','TAS');
insert into dim_australia_state values ('Northern Territory','NT');

In Snowflake, the multiple statements can’t be run at once, unless quoted them in a Snowflake Scripting block ‘BEGIN.. END’.

BEGIN
insert into dim_australia_state values ('New South Wales','NSW');
insert into dim_australia_state values ('Victoria','VIC');
insert into dim_australia_state values ('Australian Capital Territory','ACT');
insert into dim_australia_state values ('Queensland','QLD');
insert into dim_australia_state values ('South Australia','SA');
insert into dim_australia_state values ('Western Australia','WA');
insert into dim_australia_state values ('Tasmania','TAS');
insert into dim_australia_state values ('Northern Territory','NT');
END;

However, it is possible to insert duplicate rows by mistakes. There is more control to load such small dimension table in one SQL. The unstructured data type JSON object with key and value pair is most suitable in this situation.

The string is quoted by curly brackets, the key and value are separated by colon ‘:’. The key must be a string wrapped in double quotation and placed on the left of the colon. Multiple pair of elements are separated by comma ‘,’.

'{"a":100,"b":"test"}'

The above string can be converted to a variant data type in Snowflake using ‘parse_json’ function.

select parse_json('{"a":100,"b":"test"}');

The ‘flatten’ function with ‘table’ function can convert the key and value pair to two columns.

select *
from table(flatten(input=>parse_json('{"a":100,"b":"test"}')));

The THIS column contains the full unstructured data. The KEY and VALUE columns are what we are after.

The complete code that converts each column to correct data type and directly saves to a newly created table is as below. Adding ‘or replace’ in the ‘create table’ statement to allow refreshing the data if needed.

create or replace table Dim_Australia_State as 
select
key::varchar(50) as state_name,
value::varchar(3) as state_abbrv
from table(flatten (input=>parse_json('{
"New South Wales":"NSW",
"Victoria":"VIC",
"Australian Capital Territory":"ACT",
"Queensland":"QLD",
"South Australia":"SA",
"Western Australia":"WA",
"Tasmania":"TAS",
"Northern Territory":"NT"
}')));
Data output
Table description

--

--

Vivian

12+ year experience of finance data analytics. Advanced SQL, R, JavaScript programming, SAS advanced programmer.