Achieving real-time analytics via change data capture
Ofir Sharony
2015

Quick question:

How would you analyze CDC data in a data warehouse?

Instead of one record per key, you have multiple records per key, each representing a change to that record.

This complicates even simple SQL queries that ask about the current state of the dataset. The only solution I can think of is, for each query, to first group the data by key and keep only the record from the latest change. Only then can you run the actual query.
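To make the idea concrete, here is a minimal sketch of that "latest change per key" step, using an in-memory SQLite database as a stand-in for the warehouse. The table `customer_changes`, its columns (`change_seq`, `op`, `city`) and the view `customer_current` are all hypothetical names invented for illustration; a real CDC feed may order changes by a log position or timestamp instead, and may encode deletes differently.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical CDC table: one row per change, not one row per key.
CREATE TABLE customer_changes (
    customer_id INTEGER,
    change_seq  INTEGER,   -- assumed to increase with every change
    op          TEXT,      -- 'insert', 'update' or 'delete'
    city        TEXT
);
INSERT INTO customer_changes VALUES
    (1, 1, 'insert', 'Haifa'),
    (2, 2, 'insert', 'Tel Aviv'),
    (1, 3, 'update', 'Jerusalem'),
    (2, 4, 'delete', NULL),
    (3, 5, 'insert', 'Haifa');

-- Current state: for each key keep only the row with the highest change_seq,
-- and drop keys whose latest change is a delete.
CREATE VIEW customer_current AS
SELECT c.customer_id, c.city
FROM customer_changes AS c
JOIN (
    SELECT customer_id, MAX(change_seq) AS last_seq
    FROM customer_changes
    GROUP BY customer_id
) AS latest
  ON latest.customer_id = c.customer_id
 AND latest.last_seq = c.change_seq
WHERE c.op <> 'delete';
""")

# Queries about the current state now go against the view, not the raw changes.
current_rows = conn.execute(
    "SELECT customer_id, city FROM customer_current ORDER BY customer_id"
).fetchall()
print(current_rows)
```

Wrapping the grouping in a view at least keeps it out of every individual query, although the deduplication work is still done each time the view is read.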

Am I missing something?