Sunny Zhang
Published in CISS AL Big Data
Dec 10, 2021


Getting the Big Numbers is All that Matters in Big Data!

Some say there are 5 Vs we should know to understand big data. Some even say there are technically 18. These numerous Vs are the fundamental principles that make big data superior. As the principles of big data have been stated many times, correlation, the relationship between two variables, is the essential tool big data applies to reach a wanted result. "Volume" is the critical component that puts those correlations into actual effect. The volume of data refers to the size of the data sets stored to be analyzed, which are now frequently larger than terabytes and petabytes. Data at this scale requires processing technologies distinct from traditional storage and processing capabilities; the data sets in big data are simply too large to process on a regular laptop or computer. It is beyond human ability to process such data by hand, so we need technology to build this futuristic model. Let's begin by interpreting the very name of big data. Big data can generalize a bigger picture of a behavior or trend because it can extract possibilities and predictions from its near-infinite resources. By storing "big" numbers of data points, "big" data can perform its talent at its very best.
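To make that scale concrete, here is a minimal sketch of what processing such volumes looks like in practice, using PySpark as one common distributed tool; the storage path and column names are made-up placeholders, not anything from a real system:

```python
# Minimal PySpark sketch: computing a correlation over a dataset far too
# large for one laptop. The storage path and column names below are
# hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("volume-demo").getOrCreate()

# Spark reads the data as partitions spread across a cluster instead of
# loading terabytes into a single machine's memory.
events = spark.read.parquet("s3://example-bucket/events.parquet")

# stat.corr computes a Pearson correlation between two numeric columns in
# a distributed way, which is what keeps "big" volumes workable.
print(events.stat.corr("ad_impressions", "purchases"))

spark.stop()
```

The point is not the specific library but the design choice: the data stays spread across many machines, and the computation travels to the data rather than the other way around.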


Moreover, the power of "big" numbers, a substantial volume, is what lets big data achieve its best results. As mentioned above, big data is on its way to success when as much information as possible is collected: the greater the amount of information available, the more knowledge can be gained from it. For example, you are more likely to do well on a math test when you have access to all of the practice problems, because that availability lets you absorb more information and tricks. On the other hand, four example questions with neat work and explanation notes, however perfect a sample, would not necessarily help you perform well on a math test with twenty questions. Another example of the effect of volume, given in the book Big Data by Viktor Mayer-Schönberger and Kenneth Cukier, is Google building one of the most human-like translators. Google's language datasets are actually messy rather than organized, but because Google used datasets tens of thousands of times larger, so many insights could be gained through the messiness that it became a unique tool leading Google to success. "Using larger datasets enabled great strides in natural-language processing, upon which systems for tasks like voice recognition and computer translation are based." Google's artificial-intelligence guru Peter Norvig further shares the recipe behind the advanced translator: "simple models and a lot of data trump more elaborate models based on less data."
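To see the spirit of Norvig's recipe, here is a toy sketch, emphatically not Google's translator, of a "simple model plus lots of data": a bigram counter that predicts the next word purely by frequency, so the only way it improves is by being fed more text:

```python
# Toy illustration of "simple models and a lot of data": a bigram model
# that predicts the next word by raw counting. The model never gets
# smarter; only the amount of text it sees grows.
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """For each word, count which words follow it and how often."""
    following = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequently observed follower of `word`."""
    counts = model.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None

model = train_bigrams(
    "big data is messy but big data is large and big data is messy"
)
# With more text, the raw counts start to reflect real usage:
print(model["is"].most_common(2))   # [('messy', 2), ('large', 1)]
print(predict_next(model, "data"))  # 'is'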


Big data is not about taking a small amount of fancy material and looking at it over and over. The same conclusions would keep being drawn, and the value of the sample would soon vanish. Big data, by contrast, can always generate novel insights by extracting large sets of data, enormous in quantity yet decent in quality, from the surrounding environment or on a global scale. The value of big data emerges as its volume is put into effect. Volume is not only a factor of success; it is also the savior that pulls results away from inaccuracy. Comparing sampling with collecting big data, we can see that sampling avoids "n = all" because it wants the most accurate information in a number as small as possible. But wouldn't that take a long time to work out? And how do you precisely distinguish exact information from inexact information? That would require a lot of rubrics. The volume that big data calls for is precisely the opposite: we want "n = all." When we get all the information and receive vast amounts of data, we are not afraid of inaccuracy or error.
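Here is a small self-contained sketch, with fabricated numbers purely for illustration, of why "n = all" removes the sampling worry: a full scan gives the exact answer, while small samples of the same population swing around it:

```python
# Sketch of "n = all" versus sampling, using fabricated data: roughly 1%
# of one million records are flagged (value 1), the rest are 0.
import random

random.seed(42)
population = [1 if random.random() < 0.01 else 0 for _ in range(1_000_000)]

# Full scan ("n = all"): the exact rate, with no sampling error at all.
print("full scan:", sum(population) / len(population))

# Small samples: each draw gives a different, possibly misleading estimate
# of the same rate, which is the risk volume lets us avoid.
for _ in range(3):
    sample = random.sample(population, 100)
    print("sample of 100:", sum(sample) / len(sample))
```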


The "big" numbers save the day: problems that were once deadly to a sampling process become a non-issue in big data as long as we have the volume. Multiple Vs make big data's performance stand out in every field, and it is hard to say that only one of them is essential, just as big data is not limited to a single pathway. Volume, however, is definitely the engine that keeps big data and its other Vs running.

Source:

Mayer-Schönberger, Viktor, and Kenneth Cukier. Big Data: The Essential Guide to Work, Life and Learning in the Age of Insight. John Murray, 2017.
