Instance file storages and file sharing on Microsoft Azure and Google Cloud Platform

Anton Klimenko
Published in Cloud recipes
6 min read, Sep 4, 2017

Cloud computing is an essential part of software development today. There is huge support and an amazing community around AWS services, but its closest rivals, Microsoft Azure and Google Cloud Platform (GCP), are undeservedly ignored. In this article, I will review and compare the instance file storage and file sharing options provided by Microsoft Azure and GCP.

Microsoft Azure

Instance storage

Figure 1. Microsoft Azure storage infrastructure

By default, when an Azure instance starts, it has an operating system disk and a local disk. Multiple data disks can be attached to a virtual machine at any time, but the number of data disks you can attach to a VM depends on the instance type.

Persistent storage

Operating system and data disks are persistent. These disks are charged at the standard disk rate.

The OS disk, its image, and data disks are represented as Virtual Hard Disks (VHDs) in an Azure Storage Account, which is backed by either standard (HDD) or premium (SSD) disks. There is also a 2 TB capacity limit for the OS disk.

Table 1. Premium disk vs Standard disk (see the official site for more information)

Disks can be either managed or unmanaged. With unmanaged disks, you create your own storage account and then specify that storage account when setting up the disk. In this case, you are responsible for the scalability and performance of the storage account. For example, if you put too many disks into one storage account, you can exceed its IOPS limit, resulting in VMs being throttled.
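To make the throttling risk concrete, here is a rough back-of-the-envelope sketch. The limits in it (20,000 IOPS per standard storage account and 500 IOPS per standard disk) are assumptions used for illustration only, and the function name is hypothetical; check the current Azure documentation for the real figures.

```python
def max_disks_without_throttling(account_iops_limit: int, iops_per_disk: int) -> int:
    """Return how many fully utilized disks fit under the account IOPS limit."""
    if account_iops_limit <= 0 or iops_per_disk <= 0:
        raise ValueError("limits must be positive")
    # Integer division: a partially fitting disk would already push us over the limit.
    return account_iops_limit // iops_per_disk


# Assumed limits, for illustration only (not taken from this article).
ASSUMED_ACCOUNT_IOPS_LIMIT = 20_000
ASSUMED_STANDARD_DISK_IOPS = 500

print(max_disks_without_throttling(ASSUMED_ACCOUNT_IOPS_LIMIT, ASSUMED_STANDARD_DISK_IOPS))
```

Under these assumed numbers, roughly 40 heavily used disks would saturate a single storage account, which is the kind of bookkeeping you take on with unmanaged disks.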

Managed disks handle the storage account management in the background. You don’t need to worry about scalability and performance.

Temporary storage

The local disk is temporary: you can lose all data on it when you resize, shut down, or restart your VM. Its storage capacity depends on the instance type. There are no extra charges for the local disk; its price is included in the instance price.

Table 2. Local disk by VM type (see the official site for more information)

The local disk is located on the same physical hardware that hosts the VM. Thus it has higher IOPS and lower latency than a data disk.

Shared file storage

Microsoft Azure provides File Storage to share files between applications running on virtual machines. Other shared storage options are Blob storage, Queue storage, and Table storage, but these are outside the scope of this article.

In File Storage, resources are available via SMB 3.0 and a REST API. A Shared Access Signature (SAS) token is required to access files via the REST API. As a storage account owner, you can generate two types of SAS:

  • Service SAS
  • Account SAS

Service SAS grants access to resources in only one of the storage services.

Account SAS delegates access to resources in one or more of the storage services. The token holder can read and write files in shares, and manage the shares themselves.
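As a minimal sketch of how a SAS token is used with the REST API, the snippet below composes a file URL with the token appended as the query string. The account name, share name, file path, and token are placeholders, and `build_file_url` is a hypothetical helper, not part of any Azure SDK. It assumes the usual `<account>.file.core.windows.net` endpoint for Azure Files.

```python
from urllib.parse import quote


def build_file_url(account: str, share: str, file_path: str, sas_token: str) -> str:
    """Compose an Azure Files REST URL with a SAS token as the query string."""
    # Percent-encode the path, keeping "/" as the directory separator.
    encoded_path = quote(file_path, safe="/")
    # A SAS token is already a URL-encoded query string; drop a leading "?" if present.
    token = sas_token.lstrip("?")
    return f"https://{account}.file.core.windows.net/{share}/{encoded_path}?{token}"


# Placeholder values: not a real account or token.
url = build_file_url("myaccount", "myshare", "reports/q3 summary.txt", "sv=...&sig=...")
print(url)
```

Anyone holding such a URL can act within the permissions encoded in the token, so it should be treated like a credential.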

Table 3. Azure Data Disks vs Azure Files (see the official site for more information)

Durability and availability

Microsoft Azure Disks provide 99.999% availability.
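To put that figure in perspective, here is a small sketch that converts an availability percentage into the downtime it allows per year. It assumes a 365-day year, and the function name is my own.

```python
def yearly_downtime_minutes(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year, in minutes."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    minutes_per_year = 365 * 24 * 60  # assumes a non-leap year
    return minutes_per_year * (1 - availability_percent / 100)


print(round(yearly_downtime_minutes(99.999), 2))
```

Five nines works out to a little over five minutes of downtime per year.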

There are four replication options available to choose from at storage account creation:

  • Locally redundant storage (LRS)
  • Zone-redundant storage (ZRS)
  • Geo-redundant storage (GRS)
  • Read-access geo-redundant storage (RA-GRS, default option)

Table 4. Replication strategies (see the official site for more information)

Security

Security of data disks

Azure leverages the BitLocker feature of Windows and the DM-Crypt feature of Linux to provide volume encryption for the OS and data disks. The solution is integrated with Azure Key Vault for key management.

Security of Shared File Storage

Data can be secured in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.0. Storage Service Encryption provides encryption at rest and handles encryption, decryption, and key management. All data is encrypted using AES-256.

Google Cloud Platform

Instance storage

Figure 2. Google Cloud Platform storage infrastructure

Google Compute Engine instances are available in predefined sizes and as Custom Machine Types. With the latter, users can configure machines for specific needs. By default, each virtual machine has a single root persistent disk that contains the operating system. When the need arises, you can attach one or more storage volumes to your instance.

Persistent storage

There are two types of persistent disks: standard (HDD) and SSD-based disks. Each instance can attach a limited number of individual persistent disks, up to 64 TB of total storage space.

Table 5. SSD disk vs Standard disk

Disk throughput and IOPS scale linearly with disk size. You can scale your performance by resizing the disk, which requires little to no downtime. Remember, you can only scale up.
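The linear relationship and the grow-only rule can be sketched as follows. The rate of 30 IOPS per GB is an assumption for illustration, not a figure from this article, and the sketch ignores any per-disk or per-instance caps that apply in practice. Both function names are hypothetical.

```python
def estimated_iops(size_gb: int, iops_per_gb: float) -> float:
    """Estimate disk IOPS under a purely linear model (no caps applied)."""
    if size_gb <= 0:
        raise ValueError("disk size must be positive")
    return size_gb * iops_per_gb


def validate_resize(current_gb: int, new_gb: int) -> int:
    """Accept a resize only if the disk grows or stays the same size."""
    if new_gb < current_gb:
        raise ValueError("persistent disks can only grow, not shrink")
    return new_gb


# Assumed rate, for illustration only.
ASSUMED_SSD_IOPS_PER_GB = 30.0

for size in (100, 200, 400):
    print(size, estimated_iops(size, ASSUMED_SSD_IOPS_PER_GB))
```

Doubling the disk size doubles the estimated IOPS in this model, which is why resizing is the main performance lever for persistent disks.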

Temporary storage

Local SSDs are physically attached to the server hosting the VM instance. Thus, local storage provides high IOPS level and low latency. Each local SSD is 375 GB in size. You can attach up to eight local SSD devices for 3 TB of total local SSD storage space per instance.
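The capacity arithmetic above is simple enough to check in a few lines. The constants come from the figures in this section; the function name is my own.

```python
LOCAL_SSD_SIZE_GB = 375  # size of one local SSD device
MAX_LOCAL_SSDS = 8       # maximum devices per instance


def local_ssd_capacity_gb(device_count: int) -> int:
    """Return total local SSD capacity in GB for the given number of devices."""
    if not 0 <= device_count <= MAX_LOCAL_SSDS:
        raise ValueError(f"device_count must be between 0 and {MAX_LOCAL_SSDS}")
    return device_count * LOCAL_SSD_SIZE_GB


print(local_ssd_capacity_gb(MAX_LOCAL_SSDS))  # 3000 GB, i.e. 3 TB
```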

Local SSDs are available through two interfaces: SCSI and NVMe. Performance depends on which interface you choose.

Table 6. SCSI vs NVMe local disk (see the official site for more information)

RAM disks

Google also provides the ability to create a RAM disk when an application requires low latency and high throughput. RAM disks are located in system memory.

Shared file storage

The following shared file storage options, or filers (Google’s terminology), are available in GCP:

  • Persistent Disk in read-only mode
  • Single Node File Server
  • GlusterFS
  • Avere vFXT

Table 7. Shared file storage comparison

You can attach one persistent disk to several instances in read-only mode. In this case, virtual machines will get read access to the same shared source. This solution does not require any file servers.

Single Node File Server is a dedicated Compute Engine instance configured as a file server. After it starts, you can mount your shares via NFS or SMB from any host on the local subnet.

GlusterFS is an open-source distributed file system. It has three volume types: distributed, replicated, and striped. You can also create combined types. It provides cross-zone and cross-region replication.
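One practical difference between the volume types is how much usable space you get from the same set of bricks. The sketch below is a simplified model that assumes equally sized bricks and ignores file system overhead; the function name and parameters are hypothetical and not part of GlusterFS.

```python
def usable_capacity_gb(volume_type: str, brick_size_gb: int, brick_count: int,
                       replica_count: int = 1) -> int:
    """Estimate usable capacity for a volume built from equally sized bricks."""
    if brick_size_gb <= 0 or brick_count <= 0:
        raise ValueError("brick size and count must be positive")
    if volume_type in ("distributed", "striped"):
        # Files (or file chunks) are spread across bricks, so capacity adds up.
        return brick_size_gb * brick_count
    if volume_type == "replicated":
        # Every file is stored replica_count times, so capacity is divided.
        if replica_count < 1 or brick_count % replica_count != 0:
            raise ValueError("brick_count must be a multiple of replica_count")
        return brick_size_gb * brick_count // replica_count
    raise ValueError(f"unknown volume type: {volume_type}")


print(usable_capacity_gb("distributed", 100, 4))
print(usable_capacity_gb("replicated", 100, 4, replica_count=2))
```

In this model, four 100 GB bricks give 400 GB when distributed but only 200 GB with two-way replication, which is the price paid for redundancy.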

Avere vFXT is a scale-out NAS for the cloud. It is the best option for read performance. It also unifies the storage of on-premises devices and storage arrays into an extensible filer with a single namespace.

Durability and availability

Persistent disks have built-in redundancy. Persistent disk snapshots are also available. Snapshots are incremental and can be used to initialize a new persistent disk or move data to one.
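To illustrate what "incremental" means here, the following toy model stores only the blocks that changed since the previous snapshot and rebuilds a disk by replaying the chain. It is a conceptual sketch, not how Compute Engine implements snapshots; it models a disk as a dictionary of block index to bytes and ignores deleted blocks.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]


def take_incremental_snapshot(disk: Blocks, previous_state: Blocks) -> Blocks:
    """Keep only blocks that are new or changed compared with the previous state."""
    return {idx: data for idx, data in disk.items() if previous_state.get(idx) != data}


def restore(snapshot_chain: List[Blocks]) -> Blocks:
    """Rebuild a disk by applying snapshots from oldest to newest."""
    restored: Blocks = {}
    for snapshot in snapshot_chain:
        restored.update(snapshot)
    return restored


disk = {0: b"boot", 1: b"data-v1"}
snap1 = take_incremental_snapshot(disk, {})  # first snapshot is a full copy

disk[1] = b"data-v2"  # modify one block
disk[2] = b"logs"     # add one block
snap2 = take_incremental_snapshot(disk, restore([snap1]))  # only the changes

new_disk = restore([snap1, snap2])
print(len(snap2), new_disk == disk)
```

The second snapshot holds just the two changed blocks, yet replaying both snapshots reproduces the full disk contents.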

Security

Compute Engine automatically encrypts data before it is sent to persistent disk storage space. Each persistent disk is encrypted either with system-defined keys or with customer-supplied keys. Data is encrypted at rest using AES-256, and each encryption key is itself encrypted with a regularly rotated set of master keys.

Also, persistent disk data is automatically distributed across multiple physical disks in a manner that users do not control.

Compute Engine automatically encrypts local SSD storage. You cannot use custom encryption keys with local SSDs.

Conclusion

Microsoft Azure provides a wide variety of predefined instance types and storage options. Managed storage is also available. If the use case includes .NET technology, then Azure should definitely be considered as a potential cloud provider.

On the other hand, GCP provides more flexibility in storage and instance configurations. That can be useful for small projects with a limited budget.
