Book Review: Data Modeling with Snowflake by Serge Gershkovich — Lessons Beyond the Platform
Beyond Snowflake specifics, this book reframes modeling practices with transformational insights that resonate with modern, model-driven data engineering.
Summary
Serge Gershkovich’s Data Modeling with Snowflake may look like another platform guide, but the real value lies in five reframed modeling insights.
The most important is Transformational Modeling (Tx) — a concept now shaping both theory and tooling (e.g. SqlDBM’s new Tx feature).
Together, these ideas strengthen the shift toward Model-Driven Data Engineering.
Five Reframed Insights
1) Transformational Modeling (Tx)
The standout theme of this book is treating transformations as first-class model artifacts. Instead of separating schema design and ETL, Gershkovich argues that joins, filters, aggregations, lineage, and optimization patterns belong in the model.
Book example: Reverse balance fact tables are recalculated incrementally through transformations rather than stored as static aggregates. Changes to Type 2 slowly changing dimensions (SCD2) are maintained by logic embedded in the model, not only by pipelines hidden from it.
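To make the SCD2 point concrete, here is a minimal sketch of change logic that lives next to the table definition. It is not taken from the book: the `customer_dim` table, its columns, and the `apply_scd2_change` helper are all hypothetical, and SQLite stands in for Snowflake so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_dim (
        customer_id INTEGER,
        city        TEXT,
        valid_from  TEXT,
        valid_to    TEXT,      -- NULL while the row is current
        is_current  INTEGER
    )
""")

def apply_scd2_change(conn, customer_id, city, as_of):
    """Close the current row and open a new one when the tracked attribute changes."""
    row = conn.execute(
        "SELECT city FROM customer_dim WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if row is not None and row[0] == city:
        return  # attribute unchanged, nothing to historize
    if row is not None:
        # Expire the previous version instead of overwriting it
        conn.execute(
            "UPDATE customer_dim SET valid_to = ?, is_current = 0 "
            "WHERE customer_id = ? AND is_current = 1",
            (as_of, customer_id),
        )
    conn.execute(
        "INSERT INTO customer_dim VALUES (?, ?, ?, NULL, 1)",
        (customer_id, city, as_of),
    )

apply_scd2_change(conn, 1, "Berlin", "2024-01-01")
apply_scd2_change(conn, 1, "Berlin", "2024-02-01")  # same value, ignored
apply_scd2_change(conn, 1, "Munich", "2024-03-01")

history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current "
    "FROM customer_dim ORDER BY valid_from"
).fetchall()
print(history)
```

The rule "expire the old version, insert the new one" is stated once, in a place a modeler can read, which is the spirit of treating the transformation as part of the model.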
Sample SQL (daily movements rolled up into a running balance per account):
-- Net movement per account and day
WITH daily_balances AS (
    SELECT account_id,
           transaction_date,
           SUM(amount) AS daily_amount
    FROM transactions
    GROUP BY account_id, transaction_date
),
-- Running balance: cumulative sum of the daily movements
cumulative AS (
    SELECT account_id,
           transaction_date,
           SUM(daily_amount) OVER (
               PARTITION BY account_id
               ORDER BY transaction_date
           ) AS running_balance
    FROM daily_balances
)
SELECT * FROM cumulative;
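The running-balance query above can be tried end to end with a few toy rows. The sketch below wraps the same logic in a view and runs it against an in-memory SQLite database. The sample data and the view name `account_running_balance` are made up for illustration, and it assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (account_id INTEGER, amount INTEGER, transaction_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 100, "2024-01-01"),
        (1, -30, "2024-01-01"),
        (1, 50, "2024-01-02"),
        (2, 20, "2024-01-01"),
    ],
)

# The transformation is stored as a named object rather than an ad hoc query
conn.execute("""
    CREATE VIEW account_running_balance AS
    WITH daily AS (
        SELECT account_id, transaction_date, SUM(amount) AS daily_amount
        FROM transactions
        GROUP BY account_id, transaction_date
    )
    SELECT account_id,
           transaction_date,
           SUM(daily_amount) OVER (
               PARTITION BY account_id ORDER BY transaction_date
           ) AS running_balance
    FROM daily
""")

rows = conn.execute(
    "SELECT account_id, transaction_date, running_balance "
    "FROM account_running_balance ORDER BY account_id, transaction_date"
).fetchall()
for row in rows:
    print(row)
# (1, '2024-01-01', 70)
# (1, '2024-01-02', 120)
# (2, '2024-01-01', 20)
```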
Connection to SqlDBM: Gershkovich is also linked to SqlDBM, which launched Transformational Modeling (Tx) in 2024. SqlDBM positions Tx as the missing complement to relational modeling, bringing both into a single tool.
- Tx objects include CTAS statements, views, and templates for repeatable patterns.
- Lineage is automatically tracked, showing how columns are derived.
- Collaboration, governance, and version control are built in.
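One way to picture "transformations as model objects" is a small declarative record that knows its own SELECT and how it should be materialized. The sketch below is purely illustrative and is not SqlDBM's API: the `Transformation` class, its fields, and the table names are hypothetical, and the `depends_on` lineage is declared by hand rather than tracked automatically.

```python
import sqlite3
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Transformation:
    """Hypothetical Tx-style object: a named SELECT plus its materialization."""
    name: str
    select_sql: str
    materialization: str = "view"      # "view" or "table" (CTAS)
    depends_on: Tuple[str, ...] = ()   # hand-declared upstream objects

    def ddl(self) -> str:
        keyword = {"view": "VIEW", "table": "TABLE"}.get(self.materialization)
        if keyword is None:
            raise ValueError(f"unknown materialization: {self.materialization}")
        return f"CREATE {keyword} {self.name} AS {self.select_sql}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50), (2, 10, 25), (3, 11, 40)],
)

model = [
    Transformation(
        name="customer_totals",
        select_sql="SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id",
        materialization="table",
        depends_on=("orders",),
    ),
    Transformation(
        name="big_customers",
        select_sql="SELECT customer_id, total FROM customer_totals WHERE total > 50",
        depends_on=("customer_totals",),
    ),
]

# Deploy the transformations in declaration order
for tx in model:
    conn.execute(tx.ddl())

object_types = dict(
    conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE name IN ('customer_totals', 'big_customers')"
    )
)
big = conn.execute("SELECT customer_id, total FROM big_customers").fetchall()
print(object_types, big)
```

Because each transformation is data rather than a script buried in a pipeline, it can be versioned, reviewed, and rendered into documentation alongside the tables it feeds.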
See SqlDBM’s overview article and press release.
Connection to our methodology: This resonates strongly with Business-Friendly Mapping. We’ve long argued that transformation rules must be transparent, documented, and reusable. Tx formalizes this view: transformations are models too.
2) Reverse Modeling
Most organizations inherit messy warehouses with no reliable models. Gershkovich promotes reverse modeling — extracting logical/conceptual clarity from physical schemas. This is both pragmatic and essential for rediscovering business meaning.
Relevant to our work on Reverse Engineering Sources and Metadata as Data.
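As a rough illustration of reverse modeling, the sketch below reads a physical schema back out of a database catalog and turns it into a list of entities and relationships. It uses SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` as a stand-in for a warehouse's information schema. The `customer` and `orders` tables and the `reverse_model` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       NUMERIC
    );
""")

def reverse_model(conn):
    """Derive entities and relationships from the physical catalog."""
    tables = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    model = {"entities": {}, "relationships": []}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        model["entities"][table] = [c[1] for c in columns]
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})").fetchall():
            model["relationships"].append((table, fk[3], fk[2], fk[4]))
    return model

model = reverse_model(conn)
print(model["entities"])
print(model["relationships"])
```

Real legacy warehouses often lack declared foreign keys, so in practice this step only gives a starting point that still needs business review.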
3) Semi-structured Integration
Semi-structured data is no longer exotic — it’s standard. Gershkovich emphasizes systematic approaches to flatten, historize, and integrate JSON, XML, and Avro. Key point: schema-on-read is not schema-free. Semi-structured data still needs consistent logical representation.
This echoes our principles in Source Access Views and Optimizing Historization.
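The "schema-on-read is not schema-free" point can be illustrated with a small flattening routine: nested objects become prefixed columns, and an array becomes one row per element. This is a plain-Python sketch of the idea, not the book's code. The `flatten` and `explode` helpers and the sample order document are hypothetical. In Snowflake itself this kind of work is typically done in SQL with the FLATTEN table function.

```python
import json

def flatten(record, parent_key="", sep="_"):
    """Turn nested objects into a single level of prefixed column names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat

def explode(record, array_key):
    """Emit one row per array element, repeating the parent attributes."""
    parent = flatten({k: v for k, v in record.items() if k != array_key})
    rows = []
    for item in record.get(array_key, []):
        row = dict(parent)
        row.update(flatten(item, array_key))
        rows.append(row)
    return rows

raw = """
{"order_id": 7,
 "customer": {"id": 42, "city": "Oslo"},
 "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]}
"""

rows = explode(json.loads(raw), "items")
for row in rows:
    print(row)
```

Every output row has the same columns, which is the consistent logical representation the text argues semi-structured data still needs.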
4) Recursive Hierarchy Modeling
Hierarchies are rarely neat. Gershkovich reframes recursion as a modeling strategy for ragged, unbalanced hierarchies. Recursive CTEs traverse a hierarchy to whatever depth the data contains, without hard-coding one join per level, and they leave room for flexible historization.
Book example: Employee–manager tree:
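WITH RECURSIVE org AS (
    -- Anchor: employees with no manager sit at the top of the tree
    SELECT employee_id, manager_id, 0 AS level
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive step: attach each employee one level below their manager
    SELECT e.employee_id, e.manager_id, org.level + 1
    FROM employees e
    JOIN org ON e.manager_id = org.employee_id
)
SELECT * FROM org;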
This complements dimensional bridge tables and provides richer hierarchy support. Strongly connected to Business-Oriented Data Modeling.
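To see how the recursive query behaves on a ragged tree, the sketch below runs the same employee-manager pattern against an in-memory SQLite database. The five-row `employees` table is invented for illustration: one branch stops at level 1 while another continues to level 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        (1, None),  # top of the tree
        (2, 1),
        (3, 1),     # leaf at level 1
        (4, 2),
        (5, 4),     # leaf at level 3, so the tree is unbalanced
    ],
)

rows = conn.execute("""
    WITH RECURSIVE org AS (
        SELECT employee_id, manager_id, 0 AS level
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        SELECT e.employee_id, e.manager_id, org.level + 1
        FROM employees e
        JOIN org ON e.manager_id = org.employee_id
    )
    SELECT employee_id, manager_id, level
    FROM org
    ORDER BY employee_id
""").fetchall()

for row in rows:
    print(row)
# (1, None, 0)
# (2, 1, 1)
# (3, 1, 1)
# (4, 2, 2)
# (5, 4, 3)
```

No level count appears anywhere in the query, so adding a deeper reporting line changes the data, not the model.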
5) Domain-driven Perspective
Gershkovich links modeling to Data Mesh thinking: domains own their data, and models reflect business realities, not just system structures. Integration layers should remain business-shaped.
This aligns with our work on Business-Shaped Integration Layers and Universal Data Models.
Comparison Table: Classic vs. Reframed Insights
Key Takeaways
- Transformational Modeling (Tx) reframes transformations as models and is now embedded in mainstream tools like SqlDBM.
- Reverse modeling is essential for rediscovering meaning in legacy systems.
- Semi-structured data must be systematically historized and integrated.
- Recursion is a robust option for handling ragged hierarchies.
- Modeling is becoming domain-driven and business-owned.
Final Thoughts
If you strip away Snowflake specifics, Gershkovich’s book is less about technical configuration and more about reminding us that modeling is evolving. Among the reframed practices, Transformational Modeling (Tx) is the most impactful. It bridges the long-standing gap between schema and transformation logic — and now finds expression in modern modeling tools like SqlDBM.
For those pursuing metadata-first, business-friendly approaches, these insights are not just reminders but accelerators.
Verdict: A book that reinforces the broadening scope of modeling. Not revolutionary, but highly relevant for practitioners building model-driven data platforms.