Understanding and Assessing Storage Requirements

Sneha Biradar
Google Cloud - Community
4 min read · Dec 14, 2022

Storage is the process of saving digital data on a data storage device by means of computing technology. It can be classified along many parameters, which is why there are so many types of storage!

Navigating the storage landscape can be tricky, especially if your business's storage needs have recently changed and you're not sure what to do next.

Legacy data platforms have been built up over many years and are extremely diverse and complex to understand. Knowing the basic characteristics of storage is therefore necessary to assess your storage needs on the path to enhancing and transforming the way data is stored and managed.

Key Criteria to consider

  • Data format: What format is the data in today, and what format should it be in after migration (structured / semi-structured / unstructured)?
  • Functionality: What is the data being used for (caching, transactions, analytics, etc.)?
  • Capacity: How much data do you need to store?
  • Scalability: How much data will you need to store 5 years from now?
  • Performance: What throughput, IOPS and latency are you looking for?
  • Backup and Recovery: Where will you back up files, and how often?
  • Budget: How much do you have to spend?
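The capacity, scalability and performance questions above come down to simple arithmetic. Here is a minimal sketch, assuming a fixed yearly growth rate and a single I/O size; the function names and the input figures are hypothetical and only for illustration, not measurements from any real system.

```python
def projected_capacity_tb(current_tb: float, annual_growth_rate: float, years: int) -> float:
    """Compound growth: capacity needed after `years` at a fixed yearly growth rate."""
    return current_tb * (1 + annual_growth_rate) ** years


def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput is approximately IOPS multiplied by the size of each I/O."""
    return iops * io_size_kb / 1024


# Assumed inputs: 50 TB today, growing 30% per year, planned over 5 years.
print(round(projected_capacity_tb(50, 0.30, 5), 1))  # 185.6

# Assumed workload: 10,000 IOPS at a 16 KB I/O size.
print(throughput_mb_per_s(10_000, 16))  # 156.25
```

Real growth is rarely this smooth, so treat the result as a starting point for sizing rather than a forecast.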

Storage System vs Database

Storage/File system

File systems are general-purpose stores for arbitrary, often unrelated and unstructured data. Databases are built on top of the general data storage services that file systems provide.

A file system is the better fit if:

  • You want to use version control on your data (a nightmare with databases)
  • You have big chunks of data that grow frequently (typically, log files)
  • You want other apps to access your data without an API (like text editors)
  • You want to store lots of binary content (pictures or MP3s)

Database

Generally used for storing related, structured data with well-defined formats, in a way that makes inserts, updates and/or retrieval efficient (depending on the application).

DB tables are much better when:

  • You want to store many rows with the exact same structure (no block waste)
  • You need lightning-fast lookup / sorting by more than one value (indexed tables)
  • You need atomic transactions (data safety)
  • Your users will read/write the same data all the time (better locking)
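To make the indexed lookup and atomic transaction points concrete, here is a small sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the transfer and balances helpers are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
# A secondary index lets lookups by owner avoid scanning the whole table.
conn.execute("CREATE INDEX idx_accounts_owner ON accounts (owner)")
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("alice", 100), ("bob", 50)],
)
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move funds atomically: both updates are committed, or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, sender),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE owner = ?", (sender,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, receiver),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balance of every account, keyed by owner."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY owner"))
```

Calling transfer(conn, "alice", "bob", 30) moves the funds, while a transfer that would overdraw the sender is rolled back and leaves both rows untouched. Getting the same guarantee with plain files would mean writing that rollback logic yourself.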

Storage access types

It is best to customise and optimise the data migration method for applications based on the way that they are accessing storage. The typical storage input/output (I/O) access types are explained below:

File Storage

Hierarchical storage method used to organise and store data in the form of files and folders.

Protocol: NFS, SMB (including its older CIFS dialect)

Advantages: Simplified access and management of shared files, global file locking

Limitations: Fixed file system attributes for metadata, Limited scalability

Use cases: Hierarchical file systems, shared data among multiple users

Block Storage

Data is split into fixed-size blocks, each with its own address and no hierarchy or file-level metadata. The blocks are reassembled when the data is read.

Protocol: Fibre Channel, iSCSI, FCoE

Advantages: Lowest latency, consistent performance

Limitations: Expensive, No Metadata capabilities

Use cases: Structured data, transactional, underlying architecture for databases
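The idea of splitting data into addressed, fixed-size blocks and putting it back together can be sketched in a few lines. This is a toy illustration in memory, not how any particular storage array works; the 4096-byte block size and the function names are assumptions.

```python
BLOCK_SIZE = 4096  # bytes; an assumed block size for illustration


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Map block address -> fixed-size block, zero-padding the final block."""
    blocks = {}
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        blocks[offset // block_size] = chunk.ljust(block_size, b"\x00")
    return blocks


def reassemble(blocks: dict[int, bytes], length: int) -> bytes:
    """Read the blocks back in address order and trim the padding."""
    return b"".join(blocks[address] for address in sorted(blocks))[:length]
```

Note that the blocks carry no names or metadata, only addresses. Anything that knows about files, such as a file system or a database, has to live in a layer above.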

Object Storage

Data is stored as distinct, uniquely identifiable units called objects in a flat namespace.

Protocol: REST and SOAP over HTTP

Advantages: Cost-effective, highly scalable, rich custom metadata, well suited to large-scale analytics

Limitations: Not suited for frequently changing data as the whole object has to be rewritten

Use cases: Static unstructured data, read-heavy data, rich media files, backup files
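The whole-object rewrite limitation is easier to see in code. Below is a toy in-memory object store, written as a sketch only: the ObjectStore class and its put, get, head and append methods are hypothetical and do not mirror any vendor's API.

```python
import hashlib


class ObjectStore:
    """Toy object store: flat keys, whole-object writes, per-object metadata."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data: bytes, metadata=None):
        """Write (or fully overwrite) an object and return its content hash."""
        etag = hashlib.sha256(data).hexdigest()
        self._objects[key] = {"data": data, "metadata": dict(metadata or {}), "etag": etag}
        return etag

    def get(self, key) -> bytes:
        return self._objects[key]["data"]

    def head(self, key) -> dict:
        """Return metadata without the payload."""
        obj = self._objects[key]
        return {"etag": obj["etag"], "size": len(obj["data"]), **obj["metadata"]}

    def append(self, key, extra: bytes):
        """There is no in-place edit: read the whole object, then rewrite it."""
        obj = self._objects[key]
        return self.put(key, obj["data"] + extra, obj["metadata"])
```

A key such as "logs/2022/12/14.txt" looks like a path, but it is just a string in a flat namespace. Appending a few bytes still rewrites the entire object, which is why frequently changing data is a poor fit.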

Storage device configuration types

To store data, regardless of form, users need storage devices. Data storage devices come in the following main categories:

DAS

Direct-attached storage (DAS) refers to storage devices that are connected directly to the computer accessing them, usually in its immediate vicinity, through a common interface such as SATA, PCIe, USB, or Thunderbolt.

Advantages: Easy setup, Low cost and High performance

Limitations: Limited accessibility, Limited scalability, No central management and backup

Use cases: Budget constraints, A simple storage solution, For small businesses that only need to share data locally.

NAS

A network-attached storage (NAS) solution is commonly deployed as a file-level data storage device connected to the local area network (LAN), providing data access to a group of clients on that network.

Advantages: High scalability, Greater accessibility and Higher Performance

Limitations: Increased LAN traffic, performance limited by the network, security and reliability concerns

Use cases: File Storage and Sharing, Big Data, SMBs and organizations that need a minimal-maintenance, reliable and flexible storage system.

SAN

A storage area network (SAN) is a dedicated, high-performance network that transfers block-level data between servers and storage devices.

Advantages: Improved performance, Greater scalability, Improved availability and resilience.

Limitations: Expensive, more complicated to set up and maintain

Use cases: Database management systems, virtualisation, mission-critical files or applications in data centers or large-scale enterprise organisations.

Latest Advancements

Unified SAN

Like any storage technology, SANs are undergoing a transition. For example, vendors now offer something called unified SAN, which can support both block-level and file-level storage in a single solution.

vSAN

Other technologies are also emerging for bridging the gap between NAS and SAN. One example is VMware vSphere, which makes it possible to use NAS and SAN storage in the same cluster as vSAN, VMware’s virtual SAN technology.

Cloud

Although NAS and SAN systems have served as the backbone for most enterprise applications, many organisations are now turning to the cloud to meet their storage needs. Some of the many advantages are:

  • Cloud vendors provide on-demand storage services with pay-as-you-go subscription models, helping to avoid over-provisioning and extensive up-front costs.
  • Cloud platforms are highly scalable, easy to manage, provide metered resources, and include built-in redundancy.

Google Cloud offers a broad set of storage and database services, along with flexible solutions that help you migrate your data to the cloud while modernising and innovating at your own pace!

Stay tuned for more to follow on Storage Migrations to GCP soon!
