Data Warehouse vs. Data Mart vs. Data Lake

What Is What, and Where Are the Differences?

Christianlauer
CodeX

Photo by Salmen Bejaoui on Unsplash

Data Warehouses are now firmly established in the corporate world, Data Lakes are becoming increasingly common, and one often hears of Data Marts as well. But what is actually what?

The Data Warehouse

The Data Warehouse is an analytical database, usually relational (SQL) or hybrid (a mix of SQL and NoSQL), that consolidates data from different sources. Its main goal is to store historical data for later analysis. Data Warehouses typically have extensive computing and storage resources for running complex queries and generating reports, and they often serve as data sources for business intelligence and machine learning systems. Classic Data Warehouses are relational systems that work with structured data. Newer cloud-based technologies such as BigQuery or Snowflake are the exception: they are column-based and can also work with unstructured data. These new approaches, and the cloud in particular, are changing the field considerably and opening up new opportunities.
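To make the idea a little more concrete, here is a minimal sketch of the pattern described above: data from different sources is loaded into one structured table, and historical questions are answered with an aggregate SQL query. It uses Python's built-in sqlite3 module purely as a stand-in for a real warehouse engine. The table name `sales`, the two source systems ("shop" and "pos"), and all of the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical records from two operational source systems (illustrative data only).
shop_orders = [("2021-03-14", "north", 120.0), ("2022-07-02", "north", 80.0)]
pos_orders = [
    ("2021-11-30", "south", 200.0),
    ("2022-01-15", "south", 50.0),
    ("2022-09-09", "north", 70.0),
]


def build_warehouse():
    """Load both sources into one structured table with a fixed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "order_date TEXT NOT NULL, region TEXT NOT NULL, "
        "amount REAL NOT NULL, source TEXT NOT NULL)"
    )
    for source, rows in (("shop", shop_orders), ("pos", pos_orders)):
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            [(day, region, amount, source) for day, region, amount in rows],
        )
    conn.commit()
    return conn


def revenue_by_year_and_region(conn):
    """Typical analytical query: aggregate historical data for a report."""
    query = """
        SELECT substr(order_date, 1, 4) AS sales_year,
               region,
               SUM(amount) AS revenue
        FROM sales
        GROUP BY sales_year, region
        ORDER BY sales_year, region
    """
    return conn.execute(query).fetchall()


conn = build_warehouse()
report = revenue_by_year_and_region(conn)
for row in report:
    print(row)
```

A real Data Warehouse would run the same kind of query over far larger volumes and usually on a column-based engine, but the shape is the same: a defined schema first, then analysis on top of it.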

The Data Lake

The Data Lake, on the other hand, is a large pool of raw data for which no use has yet been determined. A Data Warehouse is a repository for structured, filtered data that has already been…
