What is BIG DATA?
We all use smartphones. Have you ever wondered how much data a smartphone generates in the form of texts, photos, videos, phone calls, emails, searches, and music? Approximately 40 exabytes of data are generated every month by a single smartphone user. Now imagine 40 multiplied by 5,000,000,000 smartphone users. That's a lot for our minds to even process, isn't it? In fact, this amount of data is far too much for traditional computing systems to handle, and this massive amount of data is what we term BIG DATA.
Let's have a look at the data generated per minute on the internet:
2.1 million snaps are shared on Snapchat per minute.
3.8 million search queries are made on Google per minute.
1 million people log on to Facebook per minute.
4.5 million videos are watched on YouTube per minute.
88 million emails are sent per minute.
That’s a lot of data.
How do you classify data as big data?
Data is classified as big data using the concept of the 5 Vs: Volume, Velocity, Variety, Veracity, and Value.
Let me explain all of this with a single example.
Let's take the healthcare industry. Hospitals and clinics across the world generate a massive volume of data: roughly 2,313 exabytes are collected annually in the form of patient records and test results. All of this data is generated at very high speed, which corresponds to the velocity of big data. Variety refers to the different data types, such as structured (e.g., Excel records), semi-structured (e.g., log files), and unstructured data (e.g., X-ray images). The accuracy and trustworthiness of the data is termed veracity. Finally, analyzing all this data benefits the medical sector by enabling faster disease detection, better treatment, and reduced costs, which is known as the value of big data.
How do we store and process this big data?
To do this job, we have frameworks such as Cassandra, Hadoop, and Spark.
Let's take Hadoop as an example and see how it stores and processes big data. Hadoop uses a distributed file system known as the Hadoop Distributed File System (HDFS) to store big data. If you have a huge file, it will be broken into smaller chunks and stored across various machines. Not only that, copies of each chunk are also made and placed on different nodes. This way you store your big data in a distributed fashion and ensure that even if one machine fails, your data is safe on another.
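The idea above can be sketched in a few lines of Python. This is a toy illustration, not Hadoop's actual API: the block size and node names are made up for demonstration, though the replication factor of 3 mirrors the HDFS default.

```python
BLOCK_SIZE = 4          # bytes per block (tiny on purpose; HDFS defaults to 128 MB)
REPLICATION = 3         # copies of each block, mirroring the HDFS default
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical machines

def store(data: bytes) -> dict:
    """Split data into blocks and place each block on REPLICATION different nodes."""
    placement = {}
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    for idx, block in enumerate(blocks):
        # Round-robin placement: each replica of a block lands on a different node,
        # so losing any single machine never loses the block.
        replicas = [NODES[(idx + r) % len(NODES)] for r in range(REPLICATION)]
        placement[idx] = {"block": block, "nodes": replicas}
    return placement

layout = store(b"big data example")
for idx, info in layout.items():
    print(idx, info["block"], info["nodes"])
```

Because every block lives on three distinct nodes, the original file can still be reassembled even if one machine drops out.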
The MapReduce technique is used to process big data. A lengthy task A is broken into smaller tasks B, C, and D. Now, instead of one machine, three machines take up one task each, complete them in parallel, and the results are assembled at the end. This makes processing fast, and it is known as parallel processing.
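To make the map, shuffle, and reduce phases concrete, here is a minimal word-count sketch in plain Python. In real Hadoop these phases run across many machines; here each phase is an ordinary function so the data flow is easy to follow.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["big data big ideas", "data drives ideas"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'ideas': 2, 'drives': 1}
```

Because each map call only sees its own line and each reduce call only sees one word's counts, the phases can be spread across machines and run in parallel, which is exactly what makes MapReduce fast.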
What can we do with Analyzed Data?
Now that we have stored and processed our big data, we can analyze it for numerous applications. In games like Halo 3 and Call of Duty, designers analyze user data to understand at which stage most users pause, restart, or quit playing. This insight can help them rework the storyline and improve the user experience, which in turn reduces the customer churn rate. Similarly, big data helps with disaster management. During Hurricane Sandy in 2012, it was used to gain a better understanding of the storm's effect on the east coast of the US so that necessary measures could be taken; once accurately processed and analyzed, the data could predict the hurricane's landfall five days in advance, which was not possible earlier.
Don’t forget to leave your responses.✌
Everyone stay tuned!! To get my stories in your mailbox kindly subscribe to my newsletter.
Thank you for reading! Do not forget to give your claps, share your responses, and share with a friend!