Introduction to Azure Data Lake Storage and How to Create One!

Maesak Delbar
6 min read · Jul 3, 2024


Azure Data Lake is a service for storing and analyzing big data. It can handle any amount or type of data and allows for powerful data analysis. Key features include strong security, easy integration with other Azure tools like Synapse Analytics and Databricks, and cost-efficiency since you only pay for what you use.

Difference Between Data Lake and Blob Storage:

Azure Data Lake is like a special library for very big collections of information. It helps organize this information neatly into folders and subfolders. It works closely with tools that analyze big data, such as Azure Synapse Analytics and Azure Databricks. It keeps everything safe with very detailed controls over who can access what. It runs really fast, especially when working with large amounts of data. Because it does so much, it can cost a bit more than simpler storage options.

Azure Blob Storage is like a big storage room for keeping all kinds of things. It stores data in containers using a flat namespace: there are no real folders, and any folder-like path is just part of the blob’s name. It’s mainly used for storing things like pictures, videos, backups, and logs. While it can work with other Azure services, it’s not as closely tied to analytics tools as Data Lake.
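One concrete way to see how the two relate: a Data Lake Storage Gen2 account is still a storage account, so it answers on the usual Blob endpoint and also on a separate Data Lake (dfs) endpoint. Here is a small sketch that builds both URLs, plus the abfss:// style URI that analytics engines commonly use for Data Lake paths. The helper names and the container name "raw" are made up for illustration; "demodls" is the account name used later in this post.

```python
def blob_endpoint(account: str) -> str:
    """Blob service URL for a storage account."""
    return f"https://{account}.blob.core.windows.net"


def dfs_endpoint(account: str) -> str:
    """Data Lake (hierarchical namespace) URL for the same account."""
    return f"https://{account}.dfs.core.windows.net"


def abfss_uri(account: str, container: str, path: str = "") -> str:
    """abfss:// URI pointing at a file or folder inside a container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    print(blob_endpoint("demodls"))
    print(dfs_endpoint("demodls"))
    # "raw" and the file path are hypothetical examples.
    print(abfss_uri("demodls", "raw", "sales/2024/orders.csv"))
```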

Steps to Create an Azure Data Lake:

Step 1: Set up your Azure Account

  1. If you don’t have an Azure account, sign up for one at Azure Portal.
  2. Log in to your Azure account.

Step 2: Create a Resource Group

In the Azure Portal, click on “Resource groups” in the left-hand menu.

Click on “Create” to create a new resource group.

Enter a unique name for your resource group and select a region.

After entering the name, click “Next: Tags >”. Tags are optional, so you can leave them empty.

Then click “Review + create” and “Create”.
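If you prefer code over clicking, the same resource group can be created with the Azure SDK for Python. This is only a sketch: it assumes the `azure-identity` and `azure-mgmt-resource` packages are installed and that you are already signed in (for example through the Azure CLI). The function names, the region "southeastasia", and the tag values are my own examples, not something the portal requires.

```python
from typing import Dict, Optional


def resource_group_parameters(region: str, tags: Optional[Dict[str, str]] = None) -> dict:
    """Build the request body for a resource group: a region plus optional tags."""
    params: dict = {"location": region}
    if tags:
        params["tags"] = dict(tags)
    return params


def create_resource_group(subscription_id: str, name: str, region: str,
                          tags: Optional[Dict[str, str]] = None):
    """Create (or update) a resource group. Needs the Azure SDK and valid credentials."""
    # Imported here so the helper above can be used without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.resource import ResourceManagementClient

    client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)
    return client.resource_groups.create_or_update(
        name, resource_group_parameters(region, tags)
    )


if __name__ == "__main__":
    # Hypothetical values; nothing is sent to Azure by this line.
    print(resource_group_parameters("southeastasia", {"project": "adls-demo"}))
```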

Now you can see the new resource group in the list.

Click on your resource group and then click “Create” to add a resource to it.

In the search bar that appears, type “Storage account”. Alternatively, in the Azure Portal, click on “Storage accounts” in the left-hand menu.

Click on “Create” to create a new storage account.

Subscription: Select your Azure subscription. Here I used my “Azure for Students” subscription.

Resource Group: Select the resource group you created earlier.

Storage Account Name: Enter a globally unique name (3 to 24 characters, lowercase letters and numbers only).

Region: Select the same region as your resource group, for example one in Asia Pacific.

After entering the details, click “Next”.
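The storage account name is the field that most often gets rejected, because it must be 3 to 24 characters long and may contain only lowercase letters and numbers. A quick local check like the one below can save a round trip. The function name is my own, and note that this only checks the format: whether the name is already taken somewhere in Azure is something only the portal (or the service) can tell you.

```python
import re

# 3 to 24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the name satisfies the storage account format rules."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["demodls", "Demo-DLS", "ab"]:
        print(candidate, is_valid_storage_account_name(candidate))
```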

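On the “Advanced” tab, enable “Hierarchical namespace” to use Azure Data Lake Storage Gen2 features. This setting is what turns an ordinary storage account into a Data Lake, with real folders and subfolders.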
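For readers who script their setup, here is roughly what the same choices look like with the Azure SDK for Python. Treat it as a sketch under assumptions: it needs the `azure-identity` and `azure-mgmt-storage` packages and working credentials, the function names are mine, and the "Standard_LRS" redundancy option is an assumption because this walkthrough does not pick one explicitly. The important part is `is_hns_enabled=True`, which corresponds to the “Hierarchical namespace” checkbox.

```python
def data_lake_account_settings(region: str, access_tier: str = "Hot") -> dict:
    """Settings for a general-purpose v2 account with hierarchical namespace on."""
    if access_tier not in ("Hot", "Cool"):
        # Only the two tiers discussed in this post are accepted here.
        raise ValueError(f"unsupported access tier: {access_tier!r}")
    return {
        "kind": "StorageV2",
        "location": region,
        "access_tier": access_tier,
        "is_hns_enabled": True,  # the "Hierarchical namespace" checkbox
    }


def create_data_lake_account(subscription_id: str, resource_group: str,
                             account_name: str, region: str):
    """Create the storage account. Needs the Azure SDK and valid credentials."""
    # Imported here so the settings helper works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient
    from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

    params = StorageAccountCreateParameters(
        sku=Sku(name="Standard_LRS"),  # assumed redundancy option
        **data_lake_account_settings(region),
    )
    client = StorageManagementClient(DefaultAzureCredential(), subscription_id)
    poller = client.storage_accounts.begin_create(resource_group, account_name, params)
    return poller.result()  # waits for the deployment to finish


if __name__ == "__main__":
    print(data_lake_account_settings("southeastasia"))
```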

Hot and Cool Access Tiers:
The hot tier is best for data you access frequently. Storing data there costs more, but reading and writing it costs less, which makes it a good fit for real-time analytics and regular updates.

The cool tier is for data you don’t use much but need to access quickly when you do. It’s cheaper to store but costs more to get to, so it’s good for backups, archives, and data you might need sometimes for reports or audits.
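To make the trade-off concrete, here is a tiny cost comparison. The prices in it are invented purely to show the shape of the calculation (cool is cheaper to store, more expensive to read); they are not real Azure prices, so check the pricing page for your region before deciding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    storage_per_gb: float  # monthly price per GB stored
    read_per_gb: float     # price per GB read back


# Hypothetical prices, NOT real Azure rates.
HOT = TierPrice(storage_per_gb=0.020, read_per_gb=0.000)
COOL = TierPrice(storage_per_gb=0.010, read_per_gb=0.010)


def monthly_cost(tier: TierPrice, stored_gb: float, read_gb: float) -> float:
    """Storage cost plus the cost of reading data back during the month."""
    return tier.storage_per_gb * stored_gb + tier.read_per_gb * read_gb


def cheaper_tier(stored_gb: float, read_gb: float) -> str:
    """Return 'hot' or 'cool', whichever is cheaper under the example prices."""
    hot = monthly_cost(HOT, stored_gb, read_gb)
    cool = monthly_cost(COOL, stored_gb, read_gb)
    return "hot" if hot <= cool else "cool"


if __name__ == "__main__":
    print(cheaper_tier(stored_gb=1000, read_gb=100))   # rarely read data
    print(cheaper_tier(stored_gb=1000, read_gb=2000))  # heavily read data
```

With these made-up numbers, data that is stored but rarely read comes out cheaper in the cool tier, while data that is read many times over comes out cheaper in the hot tier.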

For a deeper understanding, read this.

On the “Networking” tab, you are asked about public network access, private networks, and virtual networks. I skipped the details and left everything at its default setting.

On the “Data protection” tab, I didn’t enter any number; the retention period was automatically set to 7. This means that when you delete files from storage, they are kept in a recoverable state for 7 days. You can change this up to a maximum of one year (365 days).
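The retention window is simple date arithmetic, and it can help to see it written out. The sketch below works out until when a deleted file can still be recovered for a given retention setting. The function names are mine, and the date is just an example.

```python
from datetime import datetime, timedelta, timezone

MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 365  # one year, the maximum mentioned above


def recoverable_until(deleted_at: datetime, retention_days: int = 7) -> datetime:
    """Moment after which a soft-deleted file can no longer be restored."""
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 365")
    return deleted_at + timedelta(days=retention_days)


def is_recoverable(deleted_at: datetime, now: datetime, retention_days: int = 7) -> bool:
    """True while the file is still inside its retention window."""
    return now < recoverable_until(deleted_at, retention_days)


if __name__ == "__main__":
    deleted = datetime(2024, 7, 3, 12, 0, tzinfo=timezone.utc)  # example date
    print(recoverable_until(deleted))
    print(is_recoverable(deleted, deleted + timedelta(days=6)))
    print(is_recoverable(deleted, deleted + timedelta(days=8)))
```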

On the “Encryption” tab, I didn’t enter any specific details; it defaults to encryption with Microsoft-managed keys.

To grant appropriate permissions on the data, add roles like “Storage Blob Data Owner” or “Storage Blob Data Contributor”.

Click “Review + create”, then “Create” to create the storage account.

After you click “Create”, it takes some time to create the storage account as it goes through the deployment process.

When the deployment finishes, open the resource to view the storage account.

After opening the storage account, you can see your Data Lake Storage. At the top left, you will find the name of your storage account, in this case “demodls”.

On the left sidebar, click on “Data storage.” Within data storage, you’ll find four options. We’re going to open “Containers” to store our data inside.

Click on “Create container”. A panel will appear on the right side. Give the container a unique name.
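Container names have their own rules: 3 to 63 characters, lowercase letters, numbers, and hyphens only, starting and ending with a letter or number, and no two hyphens in a row. The sketch below checks a name locally and, as an optional second step, creates the container with the `azure-storage-file-datalake` package (where a container is called a "file system"). The function names and the example name "raw-data" are my own, and the SDK part assumes the package is installed and you have credentials with permission on the account.

```python
import re

# A letter or digit, then letters/digits each optionally preceded by ONE hyphen.
_CONTAINER_NAME = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_container_name(name: str) -> bool:
    """Check the length and character rules for a container name."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.fullmatch(name) is not None


def create_container(account_name: str, container: str):
    """Create the container. Needs the Azure SDK and valid credentials."""
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    # Imported here so the validator works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    return service.create_file_system(file_system=container)


if __name__ == "__main__":
    for candidate in ["raw-data", "Raw", "raw--data"]:
        print(candidate, is_valid_container_name(candidate))
```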

Now you can see your container. Inside, you can store any structured and unstructured data, such as photos, videos, audio files, CSVs, and more.

Open the container, and you will see the “Upload” option at the top. Click on it, and a panel will appear on the right side where you can upload files from your local system.

I chose 4 CSV files as an example.

Click on “Upload”, and the files will be stored in the ADLS container.
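The upload can also be scripted, which is handy once you have more than a handful of files. Here is a minimal sketch using the `azure-storage-file-datalake` package. It assumes the package is installed, that your identity has a data role on the account (such as “Storage Blob Data Contributor”), and that the container already exists. The function names and the file names in the example are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def remote_path(local_file: str, folder: str = "") -> str:
    """Destination path inside the container: optional folder plus the file name."""
    name = Path(local_file).name
    folder = folder.strip("/")
    return f"{folder}/{name}" if folder else name


def upload_files(account_name: str, container: str,
                 local_files: Iterable[str], folder: str = "") -> None:
    """Upload local files into the container. Needs the Azure SDK and credentials."""
    # Imported here so remote_path() works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    file_system = service.get_file_system_client(container)
    for local_file in local_files:
        file_client = file_system.get_file_client(remote_path(local_file, folder))
        with open(local_file, "rb") as data:
            # overwrite=True replaces a file of the same name if it exists.
            file_client.upload_data(data, overwrite=True)


if __name__ == "__main__":
    # Hypothetical file names; this only prints where they would land.
    for f in ["data/sales.csv", "data/customers.csv"]:
        print(remote_path(f, "csv/2024"))
```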

Now you can see all the files stored in your Data Lake Storage. You can now use them for analytics.
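As a first step toward that analysis, you can read one of the uploaded CSV files back into Python. This is a rough sketch, again assuming the `azure-storage-file-datalake` and `azure-identity` packages plus read permission on the data. The function names are mine, and it assumes the file is UTF-8 encoded with a header row.

```python
import csv
import io
from typing import Dict, List


def parse_csv_bytes(raw: bytes) -> List[Dict[str, str]]:
    """Turn the raw bytes of a CSV file into a list of row dictionaries."""
    text = raw.decode("utf-8-sig")  # also strips a byte order mark if present
    return list(csv.DictReader(io.StringIO(text, newline="")))


def read_csv_from_lake(account_name: str, container: str, path: str) -> List[Dict[str, str]]:
    """Download one file from the container and parse it. Needs the SDK and credentials."""
    # Imported here so parse_csv_bytes() works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    file_client = service.get_file_system_client(container).get_file_client(path)
    raw = file_client.download_file().readall()
    return parse_csv_bytes(raw)


if __name__ == "__main__":
    # Local demonstration with made-up data; no Azure call is made here.
    sample = b"id,name\n1,Ada\n2,Lin\n"
    for row in parse_csv_bytes(sample):
        print(row)
```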

Stay connected, and we’ll catch up next time!

If you have any queries or topic-related suggestions, feel free to reach out to me via LinkedIn.
