Big data and how it is related to data science

Data Science From a Beginner’s Perspective — Part 2

Anwar Magara
Tunapanda Institute
3 min read · Mar 1, 2019


Although there is no universally accepted definition of big data, it can be described as extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions. Data at this scale cannot be processed with traditional approaches within a reasonable time frame.

Photo credits: http://aptronics.co.za/the-future-of-networking-sdn/

The origin of big data

The Oxford English Dictionary traces the earliest known use of the term big data to 1980, when Charles Tilly used it in a sentence. In his CRSO working paper “The old new social history and the new old social history,” he wrote:

“None of the big questions has actually yielded to the bludgeoning of the big-data people.”

In 1998, John Mashey reportedly used the term big data in many of his talks, which is why the term is often attributed to him.

The Oxford English Dictionary added the term big data in 2013, and since then the term has truly come of age.

“In God we trust; all others must bring data.” ~ W. Edwards Deming

The next big question.

Some newbies (like me!) will ask questions like: What is data? How big does data need to be to be classified as big data? Is there such a thing as small data? The list is endless. Let's start answering these questions, shall we?

Data can be anything that humans interact with, or facts and statistics collected together for reference or analysis. Examples include likes on Instagram and Facebook, YouTube videos, the images and videos on your mobile devices, the transmissions between air traffic controllers and pilots, and so on.

Small data is data that is small enough to be managed on a single machine and to fit in its memory. According to Wikipedia, “small data is data that is ‘small’ enough for human comprehension.”
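
To make the "fits in memory" idea concrete, here is a minimal Python sketch of the alternative: reading values one at a time and keeping only a running total, so the whole data set never has to sit in memory at once. The function name `running_mean` and the sample numbers are made up for illustration, and real big data tools do far more than this.

```python
import io


def running_mean(lines):
    """Average numeric lines one at a time, without storing them all."""
    count = 0
    total = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        count += 1
        total += float(line)
    # Return None for an empty input instead of dividing by zero.
    return total / count if count else None


# A tiny in-memory "file" stands in for a data set too large to load at once.
sample = io.StringIO("10\n20\n30\n40\n")
print(running_mean(sample))  # prints 25.0
```

The same loop works on a real file opened with `open(...)`, because Python reads files line by line instead of loading them whole.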

How big is big data?

There is no clear number that marks where big data begins, because the volume of data keeps growing every minute. According to a 2016 study, Facebook received 701,389 logins every minute, and you can only imagine how these numbers have grown since then.
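
To get a feel for that figure, a per-minute rate can be scaled up to a full day. The short Python sketch below does that arithmetic. It assumes the rate stays constant around the clock, which is a simplification, and the helper name `per_day` is made up for this example.

```python
MINUTES_PER_DAY = 60 * 24


def per_day(per_minute):
    """Scale a per-minute count to a daily count (assumes a constant rate)."""
    return per_minute * MINUTES_PER_DAY


facebook_logins_per_minute = 701_389  # the 2016 figure quoted in the text
print(f"{per_day(facebook_logins_per_minute):,} logins per day")
# prints: 1,010,000,160 logins per day
```

Under that assumption, the 2016 rate works out to roughly a billion logins a day.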

Photo credits: Rishabh Bora (Quora)

YouTube, one of the biggest video-sharing platforms in the world, uses big data analytics: it collects data from users around the globe and uses the analysis to give each individual user informed, customized recommendations, such as music suggestions. It receives 2.78 million video views every minute.
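
As a rough illustration of how viewing data can turn into suggestions, here is a toy Python sketch. It is not how YouTube actually works; it simply suggests videos watched by people whose history overlaps with yours. The user names, video names, and the `recommend` function are all invented for this example.

```python
from collections import Counter


def recommend(history, user, top_n=2):
    """Suggest unseen videos watched by users who share videos with `user`."""
    seen = history[user]
    scores = Counter()
    for other, videos in history.items():
        if other == user:
            continue
        overlap = len(seen & videos)  # how many videos the two users share
        if overlap == 0:
            continue  # ignore users with nothing in common
        for video in videos - seen:
            scores[video] += overlap  # similar users count for more
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [video for video, _ in ranked[:top_n]]


# Made-up viewing histories.
watch_history = {
    "amina": {"python_basics", "pandas_intro", "sql_101"},
    "brian": {"python_basics", "pandas_intro", "numpy_tour"},
    "chao": {"python_basics", "guitar_lesson"},
    "dede": {"cooking_ugali"},
}

print(recommend(watch_history, "amina"))  # prints ['numpy_tour', 'guitar_lesson']
```

Real platforms use far richer signals and far larger data sets, but the basic idea of learning from what similar users did is the same.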

It goes without saying that most of the renowned social media platforms use big data to analyse their users' data and come up with personalized products or services for them.

Big data and data science.

The rise of big data also gave rise to completely new data formats and to databases of a scale never seen before. This gave statistical machine learning the opportunity to come together with big data: it draws on statistical and computational intelligence to navigate vast amounts of information with minimal or no human supervision.

The more the data, the more effective the learning.
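
One simple way to see why more data tends to help is to estimate an average from samples of different sizes. The Python sketch below is only an illustration, not a machine learning model: it draws random numbers around a known value and measures how far the estimate lands from it. The true mean of 5.0, the spread of 2.0, the seed, and the function name `average_error` are arbitrary choices made for this example.

```python
import random
import statistics


def average_error(sample_size, trials=200, true_mean=5.0, seed=0):
    """Average distance between a sample's mean and the true mean."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    errors = []
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 2.0) for _ in range(sample_size)]
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)


# The error shrinks as each sample gets bigger.
for n in (10, 100, 1000):
    print(n, round(average_error(n), 3))
```

More data does not fix everything (biased or messy data stays biased or messy), but for a well-behaved estimate like this one, larger samples give steadier answers.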

In conclusion, machine learning is often a big part of a “data science” project, and it works well with big data. Small data can be handled with tools such as Excel and R, while big data requires specialized tools to handle and manage.

Anwar Magara
Tunapanda Institute

Front-end web engineer | Trainee | Graphic designer | Photographer