Snowflake: ASOF JOIN

Ivan Jimenez
SDG Group
Published in
5 min readApr 21, 2024
Photo by Markus Winkler on Unsplash

Some weeks ago Snowflake announced a new feature that will be received very well by analysts although I am completely sure that everyone will be able to find their own purpose. Im talking about ASOF JOIN. This was announced in this release:

https://docs.snowflake.com/en/release-notes/2024/other/2024-02-28

This functionality is quite used in the data analysis field when mainly we are comparing or we want to compare two different time series. This type of comparative analysis without the use of ASOF JOIN can be done using different ways, but one of them, for example, is by using windows functions, although in some cases the query could be complex.

ASOF JOIN is a means of joining tables with time-series data when corresponding timestamp columns contain values that do not match exactly. For each row in the left table, the join finds the closest matching value from the right table.

ASOF JOIN syntax is specified in the FROM clause of a SELECT statement. Although ASOF JOIN queries can be emulated through the use of complex SQL, other types of joins, and window functions, queries are easier to write (and are often more performant) if you use the ASOF JOIN syntax.

The key capability of this join method is the analysis of two or more time series to find the closest preceding or following record that matches a given criterion. Therefore, ASOF JOIN is useful for analyzing various data sets, such as financial trading data, weather observations, readings from sensors, and audit trails. In all of these use cases, ASOF JOIN may be used to associate data when records from different sources have timestamps that are not exactly the same.

The theory always is needed, but personally, I always prefer to test the functionality in a practical way. What is better than a simple use case where everybody can understand the behavior and from there, everyone can let their imagination fly and adapt it to their needs?

Let’s get to practice!

For this use case, we are going to use a Pharmacy as an example. Why a Pharmacy? simply was the first thing that came to my mind, and as I commented to you previously, everyone adapts this to their use case.

Consider that a Pharmacy has two tables in its simple database where it registers, on the one hand, the “products” it has in its inventory to sell with the prices and, on the other hand, the “sales” registered by product and quantity.

CREATE OR REPLACE TRANSIENT TABLE PRODUCT (
ID_PRODUCT NUMBER,
NAME_PRODUCT VARCHAR,
UNIT_PRICE NUMBER(18,2),
PRICE_UPDATE_DATE TIMESTAMP_TZ
);

INSERT INTO PRODUCT
VALUES
(1,'MEDICATION_SAMPLE_1',5, TO_TIMESTAMP_TZ('2024-04-15 10:00:00')),
(2,'MEDICATION_SAMPLE_2',5, TO_TIMESTAMP_TZ('2024-04-15 10:00:00')),
(3,'MEDICATION_SAMPLE_3',5, TO_TIMESTAMP_TZ('2024-04-15 10:00:00')),
(4,'MEDICATION_SAMPLE_4',5, TO_TIMESTAMP_TZ('2024-04-15 10:00:00')),

(1,'MEDICATION_SAMPLE_1',7.5, TO_TIMESTAMP_TZ('2024-04-20 10:00:00')),
(3,'MEDICATION_SAMPLE_3',8.5, TO_TIMESTAMP_TZ('2024-04-20 10:00:00'))
;
Product table

As we see in the previous table of “products”, for the same product they may have different prices associated with a different product update date. All the products have at the beginning the same price of $5, but the products “MEDICATION_SAMPLE_1” and “MEDICATION_SAMPLE_3” updated their prices on 2024–04–20 to $7.5 and $8.5 respectively.

CREATE OR REPLACE TRANSIENT TABLE SALES (
ID_SALES NUMBER,
ID_PRODUCT NUMBER,
SOLD_QUANTITY NUMBER,
SOLD_DATE TIMESTAMP_TZ
);

INSERT INTO SALES
VALUES
(1,1,2, TO_TIMESTAMP_TZ('2024-04-15 11:00:00')),
(2,2,5, TO_TIMESTAMP_TZ('2024-04-16 12:00:00')),
(3,2,3, TO_TIMESTAMP_TZ('2024-04-17 13:00:00')),
(4,3,1, TO_TIMESTAMP_TZ('2024-04-18 14:00:00')),
(5,3,2, TO_TIMESTAMP_TZ('2024-04-19 15:00:00')),
(6,3,1, TO_TIMESTAMP_TZ('2024-04-20 16:00:00')),
(7,1,3, TO_TIMESTAMP_TZ('2024-04-20 17:00:00')),
(8,1,4, TO_TIMESTAMP_TZ('2024-04-22 18:00:00'))
;
Sales table

For its part, the “sales” table, registers the quantity of sold products and the date of sale.

Now it's the analyst's turn!

The analyst of this Pharmacy needs to get a list of all sales with their total cost per sale, taking into account the price of each product at the time of sale. As we saw in the previous data, there are products that have changed their price during the time, and because of that the total cost associated will change.

Before learning about this new feature, this analyst got the information previously commented on by using the next query with a windows function:

SELECT
SALES.ID_SALES,
SALES.ID_PRODUCT,
SALES.SOLD_QUANTITY,
SALES.SOLD_DATE,
PRODUCT.NAME_PRODUCT,
PRODUCT.UNIT_PRICE,
PRODUCT.PRICE_UPDATE_DATE,
SALES.SOLD_QUANTITY*PRODUCT.UNIT_PRICE AS TOTAL_COST
FROM SALES
LEFT OUTER JOIN PRODUCT
ON SALES.ID_PRODUCT = PRODUCT.ID_PRODUCT
AND SALES.SOLD_DATE >= PRODUCT.PRICE_UPDATE_DATE
QUALIFY ROW_NUMBER() OVER (PARTITION BY SALES.ID_PRODUCT, SALES.SOLD_DATE ORDER BY PRODUCT.PRICE_UPDATE_DATE DESC) = 1

After learning about this new functionality, he decided to simplify more his query by using the ASOF JOIN in the next query:

SELECT 
SALES.ID_SALES,
SALES.ID_PRODUCT,
SALES.SOLD_QUANTITY,
SALES.SOLD_DATE,
PRODUCT.NAME_PRODUCT,
PRODUCT.UNIT_PRICE,
PRODUCT.PRICE_UPDATE_DATE,
SALES.SOLD_QUANTITY*PRODUCT.UNIT_PRICE AS TOTAL_COST
FROM SALES
ASOF JOIN PRODUCT
MATCH_CONDITION(SALES.SOLD_DATE >= PRODUCT.PRICE_UPDATE_DATE)
ON(SALES.ID_PRODUCT = PRODUCT.ID_PRODUCT)
ORDER BY 1
Total cost per sale
Total cost per date

Conclusion

With this easy example, we see that the ASOF JOIN comes to help simplify the analysis of data mainly when we use time series comparison.

Now, you can say why they didn't include the total cost in their sales table taking into account the price at the time of the sale but as the title of this article says, we wanted to show the ASOF JOIN functionality.

Yes, yes, yes, I know that this is a very simple example!

THANKS!!

Who I am?

I am an enthusiast of new technologies and a data lover. Working with data I feel comfortable and I am always constantly seeking to learn and improve. Currently, I am working as a Data Management Specialist. If you want, you can follow me on LinkedIn.

Linkedin: https://www.linkedin.com/in/iv%C3%A1n-jim%C3%A9nez-s%C3%A1nchez-23496a91/

Bonus

In addition to the query, if we check the query profile, we also see how it is different!

Query without ASOF JOIN
Query with ASOF JOIN

--

--