Building a Data Lake on Azure

How to use Azure Data Lake Storage Gen2

Christianlauer
CodeX

Data Lakes have arrived in the corporate world, and Microsoft Azure offers a solid foundation for building one. Learn how to set up such a Data Lake and what advantages it might bring.

First, to make data available in a Data Lake, you have to integrate it there. This calls for a data integration tool that can connect to as many formats and sources as possible. Examples are Alteryx and Talend; within the Azure world, Microsoft offers Data Factory for this purpose.

Data Lake Architecture in Azure — Image Source: InfoQ[1]

The data then ends up in Azure Data Lake. With Data Lake Storage Gen2, Azure Storage becomes the foundation for creating enterprise Data Lakes in Azure. Designed specifically to handle multiple petabytes of information while supporting hundreds of gigabits of throughput, Data Lake Storage Gen2 gives you an easy way to manage massive amounts of data [2]. Unlike relational databases and classic Data Warehouses, which expect structured data, a Data Lake also lets you store unstructured and semi-structured data.
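To make this more concrete, here is a minimal sketch of how semi-structured records could be written to Data Lake Storage Gen2 from Python. It assumes the `azure-storage-file-datalake` and `azure-identity` packages are installed and that your identity has write access to the storage account. The folder layout (`raw/<source>/<dataset>/<date>/`) and the helper names `build_lake_path`, `to_json_lines`, and `upload_to_lake` are illustrative assumptions, not something prescribed by Azure.

```python
import json
from datetime import date


def build_lake_path(source: str, dataset: str, day: date, filename: str) -> str:
    """Return a date-partitioned path inside the file system.

    The raw/<source>/<dataset>/<yyyy>/<mm>/<dd>/ layout is a hypothetical
    convention chosen for this example.
    """
    return f"raw/{source}/{dataset}/{day:%Y/%m/%d}/{filename}"


def to_json_lines(records: list) -> bytes:
    """Serialize semi-structured records as newline-delimited JSON (UTF-8)."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records).encode("utf-8")


def upload_to_lake(account_name: str, file_system: str, path: str, payload: bytes) -> None:
    """Upload a payload to a file in an existing ADLS Gen2 file system."""
    # Imported here so the helpers above also work without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # ADLS Gen2 is addressed through the "dfs" endpoint of the storage account.
    service = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    file_client = service.get_file_system_client(file_system).get_file_client(path)
    # overwrite=True replaces the file if it already exists.
    file_client.upload_data(payload, overwrite=True)
```

A call could then look like `upload_to_lake("mystorageaccount", "datalake", build_lake_path("crm", "customers", date.today(), "part-0001.json"), to_json_lines(records))`, where the account and file system names are placeholders. Note that the records do not need a shared schema: each line is an independent JSON document, which is exactly the kind of data a relational table would struggle with.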
