Small Data vs Big Data

Yash Joshi
Published in Analytics Vidhya
3 min read · Nov 18, 2020

Beyond the 3 Vs!

As most of you already know, Big Data is usually defined by the 3 Vs: volume, variety, and velocity.
VOLUME: The amount of data is huge.
VARIETY: The data comes in multiple forms.
VELOCITY: Data arrives continuously, often as streams, and has to be analyzed in near real time.
But there is more that differentiates big data from small data. Let’s have a look at these other differences:

GOAL: Small data is usually analyzed to accomplish a single, specific task. With big data, the goal tends to evolve and can be redirected in unexpected ways: we may start with one specific goal, but it changes over time.

[Image: Large data is distributed across different servers over the cloud.]

LOCATION: Small data usually sits in one place, typically a single file or a database on a local PC. Big data is spread across multiple servers, often in the cloud and in multiple locations.

STRUCTURE: Small data is typically structured and often fits in a single table. Big data can be semi-structured or unstructured and is drawn from many sources.
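To make the contrast concrete, here is a minimal Python sketch. The sample records, the field names (`name`, `profile`, `address`) and the helper functions are all made up for illustration. It reads a small structured table directly, and then flattens a few semi-structured JSON records, whose fields may be missing or nested, into the same three columns.

```python
import csv
import io
import json

# Small data: one structured table, every row has the same columns.
SMALL_CSV = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"

# Big-data style input: semi-structured records from different
# (hypothetical) sources, where fields may be missing or nested.
SEMI_STRUCTURED = [
    '{"id": 3, "name": "Meera", "address": {"city": "Mumbai"}}',
    '{"id": 4, "name": "John"}',
    '{"id": 5, "profile": {"name": "Li"}, "address": {"city": "Goa"}}',
]


def read_table(text):
    """Parse a structured CSV table into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def normalise(record_json):
    """Flatten one semi-structured record into id, name and city."""
    rec = json.loads(record_json)
    # The name may sit at the top level or inside a nested "profile".
    name = rec.get("name") or rec.get("profile", {}).get("name")
    # The city is optional; it stays None when no address is present.
    city = rec.get("address", {}).get("city")
    return {"id": str(rec["id"]), "name": name, "city": city}


small_rows = read_table(SMALL_CSV)
big_rows = [normalise(r) for r in SEMI_STRUCTURED]
```

The structured table needs no per-record logic, while every extra source in the semi-structured case tends to add another rule to a function like `normalise`.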

DATA PREPARATION: Small data is usually prepared by its end users for their own specific goals, so the person entering the data knows how it will be used and what to get out of it. Big data, on the other hand, is prepared by a group of people who may not be the end users, so processing it requires far more coordination.

[Image: Big data, big storage]

LONGEVITY: Small data is kept for a limited time, typically until the project ends. Big data often needs to be stored indefinitely.

REPRODUCIBILITY: If small data is lost or turns out to be faulty, it can usually be recreated. Recreating big data is generally not feasible, so any suspect data should be carefully studied and analyzed before it is deleted.
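One way to "study before deleting" is to set suspect records aside together with the reason they were flagged, instead of dropping them. The sketch below is only an illustration of that idea: the `amount` field and the two validation rules are assumptions, not part of any particular system.

```python
def validate(record):
    """Return a reason string if the record looks bad, else None."""
    # Both rules below are illustrative assumptions.
    if record.get("amount") is None:
        return "missing amount"
    if record["amount"] < 0:
        return "negative amount"
    return None


def split_clean_and_quarantined(records):
    """Keep good records; set bad ones aside with the reason attached."""
    clean, quarantined = [], []
    for record in records:
        reason = validate(record)
        if reason is None:
            clean.append(record)
        else:
            # Nothing is deleted here: the record is kept for review.
            quarantined.append({**record, "reason": reason})
    return clean, quarantined


records = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -5.0},
]
clean, quarantined = split_clean_and_quarantined(records)
```

The quarantined list can then be reviewed, and a record is deleted only once someone has confirmed it is really bad.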

STAKES: With small data the risks are limited. With big data the risks are enormous, because the money, manpower, material, and time involved are all high.

INTROSPECTION: Small data consists of well-organized individual data points that are easy to locate, with clear metadata describing every column. In big data, many files in many formats can be hard to locate, and if they are not properly documented the data is difficult to interpret.
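The documentation problem can be pictured as a small data catalog: one entry per dataset recording where it lives, its format, and what each column means. The sketch below is a toy version under stated assumptions. The `DatasetEntry` class, the dataset names and the storage locations are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata for one dataset (a hypothetical catalog record)."""
    name: str
    location: str
    file_format: str
    columns: dict = field(default_factory=dict)  # column name -> meaning


catalog = {}


def register(entry):
    """Add or replace a dataset entry in the catalog."""
    catalog[entry.name] = entry


def find_by_column(column):
    """Return the names of all datasets that document the given column."""
    return sorted(e.name for e in catalog.values() if column in e.columns)


# The locations below are placeholders, not real storage paths.
register(DatasetEntry(
    "orders", "s3://example-bucket/orders/", "parquet",
    {"order_id": "unique order key", "customer_id": "buyer identifier"},
))
register(DatasetEntry(
    "clicks", "hdfs://example-cluster/clicks/", "json",
    {"customer_id": "visitor identifier", "url": "page that was viewed"},
))

datasets_with_customer = find_by_column("customer_id")
```

With entries like these, a question such as "which datasets contain a customer id?" becomes a lookup instead of a search through undocumented files.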

[Image: Many different forms of analysis can be performed based on a variety of data.]

ANALYSIS: Small data can be analyzed in one procedure on one machine. Big data may need to be broken apart and analyzed in steps, using different methods in distributed environments.
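As a rough illustration of "broken apart and analyzed in steps", the sketch below computes the same average twice: once in a single call, and once by splitting the values into chunks, computing a partial sum and count for each chunk, and combining the partial results. The chunks here live in one process and only stand in for data held on separate machines. The function names and the chunk size are arbitrary choices.

```python
from statistics import mean


def chunks(values, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def partial_stats(chunk):
    """Step 1, run per chunk: return (sum, count) for that chunk only."""
    return sum(chunk), len(chunk)


def combine(partials):
    """Step 2, run once: merge the per-chunk results into one mean."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


values = list(range(1, 101))

# Small data: one procedure on one machine.
one_machine = mean(values)

# Big-data style: analyze each piece, then combine the pieces.
partials = [partial_stats(c) for c in chunks(values, 30)]
distributed = combine(partials)
```

Both routes give the same answer here because a mean can be rebuilt from partial sums and counts. Not every statistic splits this cleanly (a median, for example, does not), which is one reason big data analysis often needs different methods.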

Well, these were some more differences, beyond the major 3 Vs, that I thought of sharing!

Hope you like the content!


Yash Joshi
Analytics Vidhya

A young and dynamic learner with the focus to gain knowledge in the data-driven world.