All about “Azure Storage”

Rohit Veryani
Published in ILLUMINATION
7 min read · Mar 5, 2023

Azure Storage is Microsoft's cloud storage service: it gives you access to your data anywhere and anytime, as long as you have internet connectivity. A storage account is an Azure resource and belongs to a resource group. Azure Storage is massively scalable, so you can store and process hundreds of terabytes of data to support big data scenarios. It uses an auto-partitioning system that automatically load-balances your data based on traffic. It supports clients on a diverse set of operating systems (including Windows and Linux) and a variety of programming languages (including .NET, Java, and C++) for convenient development. A storage account can contain only the Azure Storage data services: Blobs, Tables, Queues, and Files. Cosmos DB and SQL Database are separate data services. Grouping the storage data services within a single account makes them easier to manage.

Creating a storage account:
A storage account can be deployed in multiple ways: the Azure portal, PowerShell, the Azure CLI, or an ARM template.
Here we focus on creating a storage account using the Azure portal.
a) Search for “storage account” on https://portal.azure.com and click CREATE.

b) Under the BASICS tab, choose a resource group or create a new one for the storage account. The STORAGE ACCOUNT NAME must be globally unique; it must be 3 to 24 characters long and may contain only lowercase letters and numbers.
b) (i) The PERFORMANCE field has two options: Standard and Premium. Standard storage accounts are backed by magnetic drives and provide a low cost per GB, whereas Premium storage accounts are backed by solid-state drives. The performance setting can’t be changed once the account is created.
b) (ii) REPLICATION is the last option. Redundancy is of vital importance for maximizing data availability: at least three copies of user data are always kept. Data redundancy comes in several types: LRS, ZRS, GRS, etc.

c) The ADVANCED tab has three essential sections: Data Lake Storage Gen2, Blob storage, and Access tier. We keep the default settings for now. A general-purpose v2 storage account is the basic account type; it can host blobs, files, queues, and tables, and Microsoft recommends it for most scenarios that require Azure Storage. A general-purpose v1 account is a legacy account type that can also host blobs, files, queues, and tables. A Blob storage account is a legacy account type for blob-only storage. Microsoft recommends using a general-purpose v2 account instead of a Blob storage account.

d) We keep the default settings for the NETWORKING, DATA PROTECTION, and ENCRYPTION tabs.

e) The next tab is TAGS. Tags are used for management purposes and have no effect on how the storage account behaves.

f) The REVIEW tab is the final step, where you can review the chosen settings before finalizing the deployment.
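
The same account can also be created from the Azure CLI, one of the other deployment routes mentioned above. The sketch below is only an illustration: it checks a candidate name against the storage account naming rule (3 to 24 lowercase letters and digits) and assembles the arguments for `az storage account create` without running anything. The function names and the sample values (`mystorageacct01`, `my-rg`, `eastus`) are made up for this example, and global uniqueness can only be verified by Azure itself.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_account_name(name: str) -> bool:
    """Check the local naming rule (global uniqueness is not checked here)."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


def build_create_command(name, resource_group, location,
                         sku="Standard_LRS", kind="StorageV2"):
    """Return the argument list for `az storage account create`.

    The command is only assembled, never executed. The defaults chosen
    here (Standard performance with LRS, general-purpose v2) are
    assumptions made for this sketch.
    """
    if not is_valid_account_name(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return [
        "az", "storage", "account", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--location", location,
        "--sku", sku,
        "--kind", kind,
    ]


if __name__ == "__main__":
    print(" ".join(build_create_command("mystorageacct01", "my-rg", "eastus")))
```

Keeping the validation separate from the command builder makes it easy to reject a bad name before any deployment is attempted.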

Data Replication:
Data in the storage account is always replicated to ensure durability and high availability, meeting the Azure Storage SLA even in the face of transient hardware failures. Azure has several options for replicating the data in the storage account:
Locally redundant storage (LRS) maintains three copies of your data within a single facility in a single region. LRS protects your data from normal hardware failures, but not from the failure of that facility. LRS is the lowest-cost option.
Zone-redundant storage (ZRS) maintains three copies of your data, replicated across separate availability zones within a single region. This gives higher durability than LRS because the data survives the loss of a single facility, but ZRS does not protect against a region-wide outage. The original version of ZRS, since retired as “ZRS classic”, was available only for block blobs and could not be converted to or from another replication type; the current ZRS in general-purpose v2 accounts also covers files, queues, and tables.
Geo-redundant storage (GRS) is selected by default when you create a storage account. GRS maintains six copies of your data: three within the primary region and three in a secondary region hundreds of miles away, providing the highest level of durability. If the primary region fails, Azure Storage can fail over to the secondary region. GRS ensures that your data is durable in two separate regions.
Read-access geo-redundant storage (RA-GRS) provides all of the benefits of geo-redundant storage and also allows read access to the data in the secondary region if the primary region becomes unavailable. RA-GRS is recommended for maximum availability in addition to durability.
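
To make the trade-offs easier to compare, the options above can be captured in a small lookup table. This is a simplified model for illustration only: the `Redundancy` class and `pick_redundancy` helper are hypothetical names, the copy counts come from the descriptions above, and the list order (least to most redundant) is assumed to roughly track cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    name: str
    copies: int
    survives_facility_failure: bool
    survives_region_failure: bool
    secondary_readable: bool


# Ordered from least to most redundant (assumed to track increasing cost).
OPTIONS = [
    Redundancy("LRS", 3, False, False, False),
    Redundancy("ZRS", 3, True, False, False),
    Redundancy("GRS", 6, True, True, False),
    Redundancy("RA-GRS", 6, True, True, True),
]


def pick_redundancy(facility=False, region=False, read_secondary=False):
    """Return the first (least redundant) option meeting every requirement."""
    for option in OPTIONS:
        if facility and not option.survives_facility_failure:
            continue
        if region and not option.survives_region_failure:
            continue
        if read_secondary and not option.secondary_readable:
            continue
        return option
    raise LookupError("no replication option satisfies the requirements")


if __name__ == "__main__":
    choice = pick_redundancy(region=True)
    print(choice.name, choice.copies)
```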

Blob Storage:
Blob storage is an object storage solution for the cloud. It’s optimized for storing massive amounts of unstructured data, such as text or binary data, and places no restrictions on the kind of data it can hold. Data can be exposed publicly or kept private.
A blob container groups a set of blobs, much like a partition on a drive.
Any file uploaded to a blob container is known as a blob. A blob can be any type of text or binary data, such as a document, media file, or application installer.
Sample URL associated with a blob — https://storageAccountName.blob.core.windows.net/blobContainerName/blobName
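
As a quick illustration of that URL pattern, the helper below assembles a blob URL from its three parts. The function name and the sample account, container, and blob names are made up for this sketch; the blob name is percent-encoded so that spaces survive in the URL, while `/` is left alone because blob names may contain it to mimic folders.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS URL of a blob from account, container, and blob name."""
    # quote() keeps "/" unescaped by default, so virtual folder paths survive.
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name)}"
    )


if __name__ == "__main__":
    print(blob_url("mystorageacct01", "docs", "reports/2023/summary report.pdf"))
```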
There are three main types of blobs: block blobs, append blobs, and page blobs.
Block blobs store text and binary data. A block blob can have up to 50,000 blocks of up to 100 MB each (newer service versions allow larger blocks).
Append blobs are similar to block blobs but are optimized for append operations, which makes them ideal for scenarios such as logging data from VMs. An append blob can have up to 50,000 blocks of up to 4 MB each.
Page blobs store virtual hard drive files and serve as disks for Azure virtual machines. A page blob can be up to 8 TB in size.
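
The block limits above translate directly into maximum blob sizes. The short calculation below works them out, treating the article's MB figures as binary mebibytes (an assumption) and using hypothetical helper names.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def max_blob_bytes(max_blocks: int, block_size_mib: int) -> int:
    """Largest blob that fits in `max_blocks` blocks of `block_size_mib` MiB."""
    return max_blocks * block_size_mib * MIB


def blocks_needed(file_bytes: int, block_size_mib: int) -> int:
    """Number of blocks required to upload a file of the given size."""
    return math.ceil(file_bytes / (block_size_mib * MIB))


block_blob_max = max_blob_bytes(50_000, 100)   # block blob, 100 MiB blocks
append_blob_max = max_blob_bytes(50_000, 4)    # append blob, 4 MiB blocks

if __name__ == "__main__":
    print(f"block blob:  {block_blob_max / TIB:.2f} TiB")    # 4.77 TiB
    print(f"append blob: {append_blob_max / GIB:.2f} GiB")   # 195.31 GiB
    print(blocks_needed(1 * GIB, 100))                       # 11
```

So with these limits a block blob tops out at roughly 4.77 TiB and an append blob at roughly 195 GiB.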

Blob Storage Lifecycle:

Data has different requirements throughout its lifecycle. Some data is kept forever and some is needed only for a limited time. Some is accessed frequently while other data is rarely touched. Data that is accessed very frequently today, and is therefore kept in the hot tier, may later need to be moved to a cooler tier, archived, or deleted. The process of uploading data, changing its tier, and finally archiving or deleting it is called the lifecycle of the data.
Within the storage account, the Blob service section provides the Lifecycle Management option. Azure Blob Storage lifecycle management offers a rule-based policy for general-purpose v2 and Blob storage accounts. Such a policy can move data to the appropriate access tier or expire it at the end of its lifecycle. A lifecycle management policy lets you switch blob tiers, delete blobs or blob versions at the end of their lifecycles, run the defined rules on a schedule at the storage account level, and apply rules to a container or a subset of blobs.
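
A lifecycle policy is ultimately a JSON document made of rules. The sketch below builds one such rule as a Python dictionary and prints it as JSON. The rule name, the `logs/app` prefix, and the day thresholds are invented for illustration; the field names follow the lifecycle policy JSON format as I understand it, so check them against the current documentation before using the output.

```python
import json


def make_lifecycle_rule(name, prefix, cool_after_days,
                        archive_after_days, delete_after_days):
    """Build one lifecycle rule: tier to cool, then archive, then delete."""
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            # Limit the rule to block blobs under a container/prefix.
            "filters": {
                "blobTypes": ["blockBlob"],
                "prefixMatch": [prefix],
            },
            "actions": {
                "baseBlob": {
                    "tierToCool": {
                        "daysAfterModificationGreaterThan": cool_after_days
                    },
                    "tierToArchive": {
                        "daysAfterModificationGreaterThan": archive_after_days
                    },
                    "delete": {
                        "daysAfterModificationGreaterThan": delete_after_days
                    },
                }
            },
        },
    }


# Hypothetical example: age out application logs over one year.
policy = {"rules": [make_lifecycle_rule("age-out-logs", "logs/app", 30, 90, 365)]}

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```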
Lifecycle management can be configured in the Azure portal as well. The cooler tiers come with minimum retention periods: data moved to the cool tier is billed for at least 30 days, and data moved to the archive tier is billed for at least 180 days, even if it is deleted or moved earlier.
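
The minimum-duration rule can be expressed as simple arithmetic: the days still owed are the tier's minimum minus the days already spent in it. The helper below is a rough sketch with hypothetical names; it counts days only and does not model actual prices.

```python
# Minimum billable days per tier, taken from the figures above.
MIN_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def early_deletion_days(tier: str, days_in_tier: int) -> int:
    """Days still billed if a blob leaves `tier` after `days_in_tier` days."""
    return max(0, MIN_DAYS[tier] - days_in_tier)


if __name__ == "__main__":
    print(early_deletion_days("cool", 10))      # 20
    print(early_deletion_days("archive", 200))  # 0
```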

File Share:
Azure file shares are comparable to traditional network drives. A standard file share has a default size limit of 5 TB. Azure Files is a fully managed file service.
A file share is created in the cloud, and applications, on-premises machines, and virtual machines can all access it.
Azure Files lets users set up highly available network file shares that can be accessed using the standard Server Message Block (SMB) protocol. Multiple VMs can share the same files with read and write access simultaneously. A user can also access a file from anywhere using a URL that points to the file and includes a Shared Access Signature (SAS) token.
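
The two access paths described above, SMB and a URL with a SAS token, use different address formats. The sketch below builds both. The helper names and sample values are hypothetical, and the SAS token is a fake placeholder: a real token has to be generated by Azure.

```python
from urllib.parse import quote


def file_url(account: str, share: str, path: str, sas_token: str) -> str:
    """HTTPS URL of a file in a share, with a SAS token as the query string."""
    token = sas_token.lstrip("?")  # accept tokens with or without a leading "?"
    return (
        f"https://{account}.file.core.windows.net/"
        f"{share}/{quote(path)}?{token}"
    )


def unc_path(account: str, share: str) -> str:
    """UNC path used when mounting the share over SMB from Windows."""
    return "\\\\{}.file.core.windows.net\\{}".format(account, share)


if __name__ == "__main__":
    # "sig=EXAMPLE" is a placeholder, not a working signature.
    print(file_url("mystorageacct01", "teamshare", "config/app settings.json",
                   "?sv=EXAMPLE&sig=EXAMPLE"))
    print(unc_path("mystorageacct01", "teamshare"))
```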
Many on-premises applications use Azure File Sync to cache a minimal set of data locally and keep the rest in the cloud. A file share can be mounted on a server, which makes things easier for applications that share data through Azure.
Another benefit of a file share is that configuration files can be stored on it and read from multiple VMs, keeping them in sync. Tools and utilities can also be stored in a file share so that other developers can download them from the same place.
File shares have four tiers: Premium, Transaction Optimized, Hot, and Cool.
Premium file shares are backed by SSDs, which provide high performance and low latency. Premium is used in scenarios such as website hosting.
Transaction Optimized file shares are backed by HDDs and have lower performance than Premium.
Hot file shares are suitable for scenarios like team sharing.
Cool file shares are suitable for scenarios like online archiving.

Azure Queue:
Queue storage provides reliable messaging for workflow processing and communication between components of cloud services.
Azure Queue storage is designed to hold large numbers of messages that the components of distributed applications use to communicate with each other.
Messages can be accessed via HTTP and HTTPS. A message can be up to 64 KB in size. A single queue can process up to 2,000 messages per second, and messages have a default Time To Live of 7 days.
A queue name must be in lowercase.
A queue is addressed through several components: the URL format (HTTP/HTTPS), the storage account, the queue, and the message.
Sample URL format: https://storageaccountname.queue.core.windows.net/queuename
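
The naming and size rules above are easy to check before talking to the service. The sketch below validates a queue name, builds the queue URL, and tests whether a message fits within 64 KB. The helper names are hypothetical. Beyond the lowercase rule stated above, the check also applies the fuller queue naming rule (3 to 63 characters; letters, digits, and single hyphens; starting and ending with a letter or digit). The size check looks at raw UTF-8 bytes only, so any encoding a client library applies on top is not accounted for.

```python
import re

# Lowercase letters and digits, optionally separated by single hyphens.
_QUEUE_NAME = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
MAX_MESSAGE_BYTES = 64 * 1024


def is_valid_queue_name(name: str) -> bool:
    """Check length (3-63) and the character rules for a queue name."""
    return 3 <= len(name) <= 63 and _QUEUE_NAME.fullmatch(name) is not None


def queue_url(account: str, queue: str) -> str:
    """Build the HTTPS URL of a queue, rejecting invalid names."""
    if not is_valid_queue_name(queue):
        raise ValueError(f"invalid queue name: {queue!r}")
    return f"https://{account}.queue.core.windows.net/{queue}"


def fits_in_message(text: str) -> bool:
    """True if the UTF-8 encoded text is within the 64 KB message limit."""
    return len(text.encode("utf-8")) <= MAX_MESSAGE_BYTES


if __name__ == "__main__":
    print(queue_url("mystorageacct01", "order-events"))
    print(fits_in_message("hello"))
```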

Azure Table storage:
Azure Table storage is a service for storing nonrelational (NoSQL) data, and it is very cheap compared to other storage options.
It provides a key/attribute store with a schemaless design and is generally used for web application data, address books, large datasets, and datasets that don’t require complex joins or foreign keys. Entities with different sets of properties can be stored in the same table.
The components of Table storage are the storage account, the table, and the entity.
A table is a collection of entities, and an entity can be up to 1 MB in size.
Every entity has three system properties: partition key, row key, and timestamp.
Sample URL: https://storageaccountname.table.core.windows.net/tablename
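
To make the entity model concrete, the sketch below represents an entity as a plain dictionary and builds the URL that addresses a single entity by its partition key and row key. The helper names and sample data are invented for illustration. The timestamp is left out because the service maintains it.

```python
from urllib.parse import quote


def make_entity(partition_key: str, row_key: str, **properties) -> dict:
    """An entity is a set of properties plus the two key properties."""
    entity = {"PartitionKey": partition_key, "RowKey": row_key}
    entity.update(properties)  # schemaless: any extra properties are allowed
    return entity


def _key_literal(value: str) -> str:
    # Single quotes inside a key are doubled, then the value is URL-encoded.
    return quote(value.replace("'", "''"), safe="'")


def entity_url(account: str, table: str, partition_key: str, row_key: str) -> str:
    """URL that addresses one entity by its partition key and row key."""
    return (
        f"https://{account}.table.core.windows.net/{table}"
        f"(PartitionKey='{_key_literal(partition_key)}',"
        f"RowKey='{_key_literal(row_key)}')"
    )


if __name__ == "__main__":
    contact = make_entity("sales", "0001", Name="Ada", City="Pune")
    print(contact)
    print(entity_url("mystorageacct01", "contacts", "sales", "0001"))
```

Two entities in the same table may carry entirely different extra properties; only the partition key and row key together need to be unique.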


I am a tech enthusiast who loves to experiment and is fond of implementing what I have learned. I have 9+ years of experience in the Analytics and Data Engineering domain.