Data Vault Mysteries… Zero Keys & Ghost Records…

Patrick Cuba
Snowflake
Published in
10 min readApr 5, 2021

--

In today’s episode of Data Vault Mysteries we demystify Zero Keys and Ghost Records!

The two have been mentioned in literature and often their purpose has defined interchangeably; however they are in Data Vault 2.0 two very different concepts serving two very distinct purposes! Join me in the classroom as we unveil what each are and how they should be thought of in the context of data modelling! First up…

Ghost Records

A ghost record has no value, no meaning, it is a ghost! A single ghost record is inserted into a satellite table right after the table is created and forgotten about. When running a query between a hub table and its satellite or a link table and its satellite no ghost record is returned; no ghost will exist in the hub or link. So why then, do we insert a ghost record into a satellite table and simply forget about it? The answer is simple if you understand database table indexes and SQL join conditions. There are several ways you can join two relational tables together,

· LEFT JOIN returns all records from the left table (table1), and the matched records from the right table (table2). The result is NULL from the right side, if there is no match

· RIGHT JOIN returns all records from the right table (table2), and the matched records from the left table (table1). The result is NULL from the left side, when there is no match

· EQUI or INNER JOIN selects records that have matching values in both tables

Figure 1 Learn more by visiting https://www.w3schools.com/sql/sql_join.asp

Performing joins of two tables together without an index forces the SQL query optimizer to perform a full scan of the contents of each table to return which record meets the join condition you chose. Now if both tables included in the join condition are indexed then the SQL optimizer knows where to find the records to join rapidly! For those non-matched records in a LEFT JOIN, the table on the LEFT retains all the records and the record attributes from the right table that do not match are not returned. A much faster query is if we return all records that match, see bit.ly/3knWleo.

--

--

Patrick Cuba
Snowflake

A Data Vault 2.0 Expert, Snowflake Solution Architect