Getting Most Out of Redshift With Composite Joins

Image for post
Image for post
Query Plan from Redshift

Overview

Redshift Core Principle(s)

Tricking Redshift to not distribute data

table_aid1 integer encode az64 
,id2 integer encode az64
,id3 integer encode az64
,id4 integer encode az64
,snapshot_date timestamp encode raw
distkey(id1)
sortkey(snapshot_date)
table_bid1 integer encode az64
,id2 integer encode az64
,id3 integer encode az64
,id4 integer encode az64
,code varchar(100) encode BYTEDICT
,snapshot_date timestamp encode raw
distkey(id1)
sortkey(snapshot_date)
table_c
id1 integer encode az64
,value varchar(100) encode zstd
distkey(id1)
SELECT a.id1
,a.id2
,a.id3
,a.id4
,b.code
,c.value
FROM table_a a
JOIN table_b b ON a.id1 = b.id1
AND a.id3 = b.id3
AND a.id4 = c.id4
JOIN table_c c ON a.id1 = c.id1
WHERE a.snapshot_date = '2020-10-01'
AND b.snapshot_date = '2020-10-01'
CREATE TEMP TABLE a_c
(
id1 integer encode az64
,id2 integer encode az64
,id3 integer encode az64
,id4 integer encode az64
,value varchar(100) encode zstd
,table_a_b_join_hash varchar(50) encode zstd
)
distkey(table_a_b_join_hash)
sortkey(table_a_b_join_hash);
INSERT INTO a_c
SELECT a.id1
,a.id2
,a.id3
,a.id4
,c.value
,FUNC_SHA1(CAST(a.id1 AS VARCHAR(20)) || CAST(a.id3 AS VARCHAR(20)) || CAST(a.id4 AS VARCHAR(20))) AS table_a_b_join_hash
FROM table_a a
INNER JOIN table_c c
ON a.id1 = c.id1
WHERE a.snapshot_date = '2020-10-01';
CREATE TEMP TABLE b
(
id1 integer encode az64
,id3 integer encode az64
,id4 integer encode az64
,code varchar(100) encode bytedict
,table_a_b_join_hash varchar(50) encode zstd
)
distkey(table_a_b_join_hash)
sortkey(table_a_b_join_hash);
INSERT INTO b
SELECT id1
,id3
,id4
,code
,,FUNC_SHA1(CAST(id1 AS VARCHAR(20)) || CAST(id3 AS VARCHAR(20)) || CAST(id4 AS VARCHAR(20))) AS table_a_b_join_hash
FROM table_b
WHERE snapshot_date = '2020-10-01';
SELECT a.id1
,a.id2
,a.id3
,a.id4
,b.code
,a.value
FROM a_c a
JOIN b
ON a.table_a_b_join_hash = b.table_a_b_join_hash;

The Startup

Medium's largest active publication, followed by +756K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store