The Five M’s of Big Data

Madelyn Tran
Published in CISS AL Big Data
6 min read · Sep 8, 2023

INTRODUCTION

What do the words “velocity,” “value,” “variety,” and “veracity” all have in common? Aside from starting with the letter “V,” they’re all words used to define Big Data, an ever-growing field that analyzes data to provide valuable insights and predictions. Like Big Data itself, the number of “V’s” used to define it has grown exponentially. Now, a simple Google search will turn up hundreds of “V’s,” some with only the most abstract connections to Big Data, such as “vulpine” and “voodoo” (Shafer, 2017). At the rate the list of V’s is growing, it may soon become as vast as the databases used to complete Big Data projects.

Figure 1: 5 V’s of Big Data (Singh, 2023)

Data analytics provides a revolutionary new opportunity for data to be used, something so big that it’s hard to comprehend. Big Data is defined as “data sets that are too large or complex to be dealt with by traditional data-processing application software” (Oracle, n.d.). While that definition may seem intimidating to the average person, the complex background behind the surface-level term “Big Data” can be broken down into simpler parts, something that’s attempted in Figure 1 with the “Big V’s of Big Data.”

Similar to the data it analyzes, Big Data may seem complex and hard to understand, but its beauty is revealed when it’s broken down and simplified. As such, I’d like to break down Big Data into the five M’s: machine, maximum, mobile, multipurpose, and malleable.

THE BIG M’S OF BIG DATA

Machine

Figure 2: Big Data Processing Machines (Nguyen, n.d.)

The world of Big Data emerged only recently, as computing power increased with the introduction of new machines into our society. It seems natural, then, to attribute one of the big “M’s” to machines. Although data has existed and been collected since the beginning of the internet, we couldn’t use it to its full potential until these new machines emerged. Their processing power is beyond our minds’ capabilities, as seen in Figure 2. The everyday software we’re used to couldn’t handle a fraction of the data used in Big Data analytics (Oracle, n.d.).

In addition to making Big Data feasible, machines are also making it more convenient for companies to manage. Our world’s computing power has grown and will continue to grow at a rapid pace, providing revolutionary opportunities for humanity to tap the largely untouched resource of data. Our storage capabilities are expanding as well, giving Big Data companies more room to work with. We, as a society, are continuously pursuing technological progress, and as long as we do so, Big Data will follow suit.

Maximum

Figure 3: Maximum Data (Computer Hope, 2022)

To the untrained eye, Big Data may seem like just a subset of statistics, but the opposite is closer to the truth: statistics is what Big Data looks like without machines, which is why “machine” is one of the big M’s. Throughout history, humanity has relied on random sampling, in which n (the number of entries) was limited to a small group of people. While this process worked for a while, it had its flaws. To start, sampling had an error rate of up to 3%, which was extraordinary for its time but is too imprecise for our modern society. Not only that, but we could only get a snapshot of what the data told us; we were able to “zoom out,” but never zoom in to see the details. What allows Big Data to overcome these flaws is its ability to make n equal all, that is, to use the maximum amount of data available, far more than what is seen in Figure 3. Now, with Big Data, survey errors are marginal, though not nonexistent, and we’re able to “zoom in” to see all the sub-categories that appear.
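To make the sampling point concrete, here is a minimal sketch of how sampling error shrinks as n grows. It assumes the 3% figure refers to the familiar 95% margin of error for a proportion from a simple random sample; the function names and the sample sizes are illustrative choices, not something taken from the sources cited here.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def margin_of_error_finite(n, population, p=0.5, z=1.96):
    """Same estimate with the finite population correction,
    which drives the sampling error to zero when n == population."""
    correction = math.sqrt((population - n) / (population - 1))
    return margin_of_error(n, p, z) * correction

# Illustrative sample sizes (assumed, not from the article).
for n in (100, 1_000, 10_000, 1_000_000):
    print(n, f"{margin_of_error(n) * 100:.2f}%")

# When the "sample" is the whole population (n = all),
# the sampling error vanishes.
print(margin_of_error_finite(5_000, 5_000))
```

A sample of roughly a thousand people gives an error of about 3%, and the error only disappears when n equals the full population. This covers sampling error alone; measurement mistakes and biased collection can remain, which is why errors become marginal rather than nonexistent.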

Mobile

Figure 4: Mobility (Svetlik, 2022)

Big Data is ever-changing. New technologies are constantly being developed, and more data is generated as people use them. Big Data is also a fairly young field, having come into existence only in the 1990s. Predicting exactly where it will go is a task far out of reach, but one thing’s certain: it’s going to change, grow, and evolve due to its dynamic nature. With the widespread use of phones, as seen in Figure 4, our data collections are growing exponentially, and Big Data’s repertoire of data is expanding along with them. It’s only fitting to pay tribute to that by using the word “mobile” to describe the field’s dynamic nature.

Multipurpose

Like every other revolutionary field, Big Data has caused major shifts within our global society, one of which is the shift away from causality and toward the search for correlations. Instead of following the natural human instinct to figure out all the layers of the universe, we shift to a computerized mindset in which we accept what is given at the surface level in exchange for gaining more insight into our world.

As such, we can use data for many different purposes. Trends in lipstick sales may give insight into a possible recession, or they could be used to predict who the next “top influencer” might be; the possibilities are endless with Big Data. Our mindset has shifted away from the confining path of causation, in which we try to understand the intricacies between the independent and dependent variables. Instead, we now look for patterns across all the connections that variables form with each other, which gives Big Data its multipurpose nature.
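As a rough illustration of looking for correlations rather than causes, the sketch below reuses one series against several others and reports how strongly each pair moves together. The numbers are made up purely for illustration, and the series names (`lipstick_sales`, `unemployment_rate`, `influencer_mentions`) are hypothetical; a Pearson coefficient is only one of many ways to measure a pattern.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up monthly figures, for illustration only (not real data).
lipstick_sales = [10, 12, 15, 19, 24, 30]
other_series = {
    "unemployment_rate": [4.0, 4.2, 4.9, 5.5, 6.3, 7.1],
    "influencer_mentions": [50, 48, 70, 65, 90, 120],
}

# The same sales data is reused against every other series.
for name, values in other_series.items():
    print(name, round(pearson(lipstick_sales, values), 2))
```

A coefficient near 1 or -1 says only that two series move together; it says nothing about why, which is exactly the trade-off described above.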

Additionally, Big Data’s multipurpose nature can be attributed to the data’s reusability. Unlike industries where resources are used once and never again, data can be used over and over. Instead of getting only one use out of a limited resource, Big Data draws on data, a resource that is never used up. As such, the same data can keep helping society again and again.

Malleable

One of the main benefits of Big Data is its malleability. While a traditional statistical study is limited in scope, typically testing the connection between a small number of variables chosen in advance, Big Data is the opposite: you can find things in it that you never expected to see. As mentioned above, data can be reused for multiple purposes and can hold special insights that could never have been predicted.

CONCLUSION

In conclusion, Big Data is a complex and dynamic field. Although many will try to expand its definition to accommodate this fact, I’d prefer to propose the Big M’s to summarize its present and its future. The five M’s capture what makes Big Data special and unique, in terms broad enough to keep describing the field as it evolves.

REFERENCES

Nguyen, N. (n.d.). Leveraging Big Data, AI and Machine Learning for Critical Business Insights. Adnovum. https://www.adnovum.com/blog/leveraging-big-data-ai-and-machine-learning-for-critical-business-insights

Shafer, T. (2020, November 11). The 42 V’s of Big Data and data science. Elder Research. https://www.elderresearch.com/blog/the-42-vs-of-big-data-and-data-science/

Singh, R. (2022, July 24). Best smartphones 2022: Top 16 mobile phones ranked. Uswitch. https://www.uswitch.com/mobiles/guides/best-mobile-phones/

Singh, R. (2023, July 24). What is Big Data Processing Tools and use cases of Big Data Processing Tools? DevOpsSchool.com. https://www.devopsschool.com/blog/what-is-big-data-processing-tools-and-use-cases-of-big-data-processing-tools/

What is Big Data? Oracle. (n.d.). https://www.oracle.com/big-data/what-is-big-data/#:~:text=Big%20data%20defined,-What%20exactly%20is&text=The%20definition%20of%20big%20data,especially%20from%20new%20data%20sources.

What is data? Computer Hope. (2022, November 18). https://www.computerhope.com/jargon/d/data.htm
