What’s the recipe for a healthy Data Model in Amazon Redshift?

Anjali Warier
Jun 7 · 3 min read

Ingredients:

Compression Encoding
Sort Keys
Distribution-style

Recipe:

1. While you are creating your tables, use compression encoding in all your columns. Compression allows more data to be stored within a cluster. It improves performance by decreasing I/O. To determine what compression encodings are best for your table, Create the table without any compression and then Load the table with a sizeable amount of data and run the below:

ANALYZE COMPRESSION enterprise_db.product;
CREATE TABLE enterprise_db.product(
product_id int encode zstd,
location varchar(32) encode bytedict,
load_dt date encode zstd);
CREATE TABLE enterprise_db.product(
product_id int,
location varchar(32) encode bytedict,
load_dt DATE encode zstd
) SORTKEY(product_id,load_dt);
Distribution Styles
SELECT diststyle,skew_rows
FROM svv_table_info WHERE “table” = ‘your_table’;
CREATE TABLE enterprise_db.product(
product_id int,
location varchar(32) encode bytedict,
load_dt DATE encode zstd
)
DISTKEY(product_id)
SORTKEY(product_id,load_dt);
SELECT count(1) FROM schema_name.your_table
SELECT count(*) FROM schema_name.your_table

Slalom Technology

Thought leadership from technologists at Slalom.

Anjali Warier

Written by

Slalom Technology

Thought leadership from technologists at Slalom.