Data Vault 2.0 — Point in Timetable

Faris Hussain
4 min readDec 6, 2023

--

Point In Timetable — I

In Data Vault 2.0, there are query Query Assistance tables that are being used to reduce the complexity of the queries against the data vault and improve its performance. Such as point-in-timetable, its primary purpose is to streamline data retrieval from the data vault, ultimately boosting query performance and making data exploration more efficient.

Point-in-timetables cover the problem when there are multiple satellites on the hub or link and when you are querying the data out of it. The problem arises because the changes to the business objects stored in the source system don’t happen at the same time.

Consider HUB_Customer having multiple satellites named SAT_NAME, SAT_ADDRESS, and SAT_CONTACT. These are being split based on rate of change, as we know that different satellites can be based on source, data type, rate of change, etc.

Modeled using SQLDBM.

Here, we guess that the customer's phone number probably changed more often than the customer's name. Therefore, we don't want the entire row to get loaded every time there is a change in the phone number.

Now the data on each satellite is as follows:

SAT_NAME:

Customer Name Satellite

SAT_ADDRESS:

Customer Address Satellite

SAT_CONTACT:

Customer Contact Satellite

If the end user wants to know the current location of the customer, then, in the subquery, you have to select the max LOAD_DATE of SAT_ADDRESS and join with the HUB. In this way, you will get the Customer_ID business key and the latest address of the customer from SAT_ADDRESS.

Current View:

SELECT
H.Cust_ID, SA.City, SC.Phone
FROM HUB_Customer H
JOIN SAT_ADDRESS SA
ON H.HK_Cust_ID = SA.HK_Cust_ID
WHERE SA.LOAD_DATE =
(SELECT (MAX(S2.LOAD_DATE)
FROM SAT_ADDRESS S2
WHERE H.HK_Cust_ID = S2.HK_Cust_ID)

The query will be more complicated if you have to add additional join clauses for every additional satellite table accessed. Moreover, if historical data retrieval is also required, the complexity of the query further escalates.

Historical View:

SELECT
H.Cust_ID, SA.City, SC.Phone
FROM HUB_Customer H
JOIN SAT_ADDRESS SA
ON H.HK_Cust_ID = SA.HK_Cust_ID
WHERE SA.LOAD_DATE =
(SELECT (MAX(S2.LOAD_DATE)
FROM SAT_ADDRESS S2
WHERE H.HK_Cust_ID = S2.HK_Cust_ID
AND S2.LOAD_DATE <’03–01–2018')

Here, Point-In-Timetable helps simplify SQL queries. The point-in-timetable consists of the hash keys of the customer and the point-in-time LOAD_DATE (PIT_LOAD_DATE) for which to keep the record from each satellite, along with the recent LOAD_DATE from all the satellites attached to the HUB with respect to the PIT_LOAD_DATE. As shown in the diagram below,

Point-In-Timetable — PIT_CUSTOMER

These are the records filled in the PIT_CUSTOMER table, as follows:

For each point in time (PIT_LOAD_DATE), the maximum LOAD_DATE from each satellite is being extracted and loaded.

Now we want to report for 4/1/2018 and assume today is 6/1/2018. We will take the business key, HK_Cust_ID, from Customer_HUB, join it with the point-in-time HK_Cust_ID, and look for the PIT_LOAD_DATE for the date we want to report, i.e., 4/1/2018. Now, with PIT_LOAD_DATE 4/1/2018, we will get the maximum LOAD_DATE of each satellite for 4/1/2018. With the LOAD_DATE of each satellite, we will join the LOAD_DATE with the respective satellite and get the information for that time.

In conclusion, the Point-In-Time Table addressing the complexities of querying historical data, it empowers Data Vault 2.0 users to navigate their data with greater ease and precision, marking a significant stride in the journey towards more efficient, accessible, and powerful data warehousing.

— — Added null value in the PIT table instead of incorporating ghost records in Satellites. In order to introduce basic of PIT in Data Vault, Here I refrained from delving into the details of ghost records. Null shouldn’t be in PIT. Please for the detail & well explanation visit this article by Data Vault Guru Patrick.
https://www.linkedin.com/pulse/data-vault-mysteries-zero-keys-ghost-records-patrick-cuba/

--

--