Azure Blob Storage vs Azure Data Lake Storage Gen 2: Explained Like You’re 5

Pratik Mukesh Bharuka
2 min read · May 13, 2023

Azure Blob Storage and Azure Data Lake Storage are both places where you can store data on the internet. They are like big digital containers that can hold all kinds of stuff, like pictures, videos, documents, and more.

However, they are designed for different purposes. Azure Blob Storage is like a big warehouse. It can store all sorts of data, from pictures to videos to documents. It’s good for storing data that you need to access quickly, like when you’re working on a project or sharing files with friends.

Azure Data Lake Storage Gen2 is like a big library. It’s good for storing large amounts of data that you need to process with big data tools, like Hadoop and Spark. It’s also good for storing data for machine learning. Under the hood, it is a massively scalable, secure data lake built on top of Azure Blob Storage. It is designed for big data analytics and offers a hierarchical file system, meaning real folders inside folders, like the sections and shelves of a library.
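To see why a hierarchical file system matters, here is a toy sketch in plain Python. It does not call Azure at all: the dictionaries and the function names `rename_flat` and `rename_hierarchical` are made up for illustration. It only shows the general idea that in a flat store a "folder" is just a shared name prefix, while in a hierarchical store a folder is a real thing you can rename in one step.

```python
# Toy model only: plain Python dicts, not the Azure SDK.

def rename_flat(blobs: dict, old_prefix: str, new_prefix: str) -> int:
    """Flat namespace: a "folder" is only a shared name prefix,
    so renaming it means rewriting every matching blob name."""
    touched = 0
    for name in list(blobs):  # copy the keys so we can modify the dict
        if name.startswith(old_prefix):
            blobs[new_prefix + name[len(old_prefix):]] = blobs.pop(name)
            touched += 1
    return touched


def rename_hierarchical(root: dict, old_name: str, new_name: str) -> int:
    """Hierarchical namespace: a directory is a real node,
    so renaming it is one change no matter how many files it holds."""
    root[new_name] = root.pop(old_name)
    return 1


# The same three files, stored the "warehouse" way and the "library" way.
flat = {"logs/2023/a.csv": b"1", "logs/2023/b.csv": b"2", "img/cat.png": b"3"}
tree = {"logs": {"2023": {"a.csv": b"1", "b.csv": b"2"}}, "img": {"cat.png": b"3"}}

flat_ops = rename_flat(flat, "logs/", "archive/")
tree_ops = rename_hierarchical(tree, "logs", "archive")
print(flat_ops, tree_ops)  # 2 1
```

With two files the difference is tiny, but big data jobs often move folders holding millions of files, which is where one-step folder operations start to pay off.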

I’m confused. What’s new in Gen2?

Azure Data Lake Storage Gen2 (ADLS Gen2) is the successor to ADLS Gen1. It combines the capabilities of a hierarchical file system with the low-cost storage of Azure Blob Storage. Everything lives in one namespace, and the same data can be reached either as files and folders through the Data Lake (DFS) API or as objects through the Blob API. ADLS Gen2 also supports Azure Active Directory-based access control, and Hadoop tools can read and write it through an HDFS-compatible driver (ABFS).
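As a small illustration of "one copy of the data, several ways in", the sketch below builds the three common address forms for the same file. The account name `mystorageacct`, the container name `raw`, and the helper function names are placeholders invented for this example. The host name patterns (`blob.core.windows.net`, `dfs.core.windows.net`) and the `abfss://` scheme are the standard ones for Azure Storage, but the code only formats strings and never contacts Azure.

```python
def blob_url(account: str, container: str, path: str) -> str:
    """Address used by the Blob API (object-style access)."""
    return f"https://{account}.blob.core.windows.net/{container}/{path}"


def dfs_url(account: str, filesystem: str, path: str) -> str:
    """Address used by the Data Lake (DFS) API (file-style access)."""
    return f"https://{account}.dfs.core.windows.net/{filesystem}/{path}"


def abfss_uri(account: str, filesystem: str, path: str) -> str:
    """Address used by Hadoop and Spark through the ABFS driver."""
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path}"


# Hypothetical names, for illustration only.
account, container, path = "mystorageacct", "raw", "sales/2023/may.csv"

print(blob_url(account, container, path))
print(dfs_url(account, container, path))
print(abfss_uri(account, container, path))
```

Note that a Blob Storage "container" and a Data Lake "file system" are the same thing seen through two different APIs, which is why the same name is passed to all three helpers.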

The key differences between ADLS Gen1 and Gen2 are the underlying storage technology and the access methods. ADLS Gen1 was a separate service with its own distributed file system, while ADLS Gen2 is Azure Blob Storage with a hierarchical namespace added on top. Because it reuses Blob Storage, ADLS Gen2 is a more cost-effective way to store large amounts of data, while still providing the kind of security and access control features that ADLS Gen1 had.

Summary:

“Data lake” and “blob storage” are just fancy terms, and you are probably already familiar with similar services like iCloud and Google Drive. Azure Blob Storage is essentially the same idea, but with added security controls. When companies started to store massive amounts of data in the cloud, Microsoft improved it by adding big data features, such as a hierarchical file system and HDFS-compatible access for tools like Hadoop and Spark. This improved version is known as Azure Data Lake Storage Gen2.

That’s all for today, folks! Make sure to follow me for more Azure Data Engineering content in the future!
