What is Big Data?

Dhanush V
4 min read · Jan 19, 2023


Have you ever heard the term “big data” and wondered what it is?

What is Big Data?

Big data is high-volume, high-velocity and high-variety information that requires new forms of processing for enhanced decision making, insight discovery and process optimization.

Need for Big Data

The rise in technology has led to the production and storage of voluminous amounts of data. Earlier, data sizes were measured in megabytes; nowadays petabytes are processed and analyzed to discover new facts and generate new knowledge. Conventional systems for storage, processing and analysis struggle with this growth: larger volumes, a greater variety of forms and formats, increasing complexity, faster generation of data, and the need to process, analyze and use it quickly.

[Figure: Evolution of Big Data]

[Figure: Types of Big Data]

Classification of Big Data

Big data can be classified as:

  • Structured

Structured data conform to and associate with data schemas and data models. Structured data are found in tables (rows and columns). Nearly 15–20% of data are in structured or semi-structured form.

  • Semi-Structured

Examples of semi-structured data are XML and JSON documents. Semi-structured data contain tags or other markers, which separate semantic elements and enforce hierarchies of records and fields within the data. Semi-structured data do not conform to or associate with formal data models, such as the relational database and table models.

  • Multi-Structured

Multi-structured data refers to data consisting of multiple formats, viz. structured, semi-structured and/or unstructured data. Multi-structured data sets can have many formats and are found in non-transactional systems. Examples are streaming data on customer interactions, data from multiple sensors, data at a web or enterprise server, and data-warehouse data in multiple formats.

  • Unstructured

Unstructured data do not possess data features such as a table or a database. They are found in file types such as .TXT and .CSV. The data may be stored as key-value pairs, such as hash key-value pairs, and may have internal structure, as in e-mails. The data do not reveal relationships or hierarchies; the relationships, schema and features need to be established separately.

Examples include website content data: YouTube videos, browsing data, e-payments, web store data and user-generated maps.
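To make the classification above more concrete, here is a minimal Python sketch that holds the same piece of information in three forms. The order table, the JSON document and the support note are all made-up examples (they are not from any real system), and the regular expression is just one possible way of establishing structure for free text.

```python
import json
import re
import sqlite3

# Structured: a table with a fixed schema (rows and columns).
# The "orders" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (101, "Asha", 250.0))
structured_row = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = 101"
).fetchone()
conn.close()

# Semi-structured: a JSON document. Keys act as tags that mark fields
# and nesting, but there is no table schema behind it.
document = json.loads(
    '{"order_id": 101, "customer": {"name": "Asha"}, '
    '"items": [{"sku": "A1", "qty": 2}]}'
)
customer_name = document["customer"]["name"]

# Unstructured: free text. Any structure has to be established
# separately, here with a simple regular expression.
note = "Asha called to say order 101 arrived late but the product was fine."
match = re.search(r"order (\d+)", note)
order_id_from_text = int(match.group(1)) if match else None

print(structured_row, customer_name, order_id_from_text)
```

The structured and semi-structured versions can be queried directly by column or key, while the unstructured note only yields an order number after a pattern has been written for it.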

Big Data Characteristics

  • Volume: the size of the data, which is what gives big data its name.
  • Velocity: the speed at which data is generated.
  • Variety: the range of forms and formats in which data arrives.
  • Veracity: the quality of the data captured, which can vary greatly and affects how accurately it can be analyzed.
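As a rough illustration of the four characteristics, the sketch below computes one simple number for each over a tiny batch of events. The event list, the field names (`ts`, `format`, `payload`) and the rule that a payload containing `null` counts as low quality are all assumptions made for this example; real systems define these measures in their own ways.

```python
# A tiny, made-up batch of events: timestamp in seconds, format, raw payload.
events = [
    {"ts": 0.0, "format": "json", "payload": '{"temp": 21.5}'},
    {"ts": 0.5, "format": "json", "payload": '{"temp": null}'},
    {"ts": 1.0, "format": "text", "payload": "door opened"},
    {"ts": 2.0, "format": "json", "payload": '{"temp": 22.0}'},
]


def summarize(batch):
    """Return one illustrative number per characteristic for a non-empty batch."""
    # Volume: total payload size in bytes.
    volume_bytes = sum(len(e["payload"].encode("utf-8")) for e in batch)

    # Velocity: events per second over the time span of the batch.
    duration = batch[-1]["ts"] - batch[0]["ts"]
    velocity = len(batch) / duration if duration > 0 else float(len(batch))

    # Variety: number of distinct formats seen.
    variety = len({e["format"] for e in batch})

    # Veracity: share of events that pass a (hypothetical) quality check.
    valid = sum(1 for e in batch if "null" not in e["payload"])
    veracity = valid / len(batch)

    return {
        "volume_bytes": volume_bytes,
        "velocity_per_sec": velocity,
        "variety": variety,
        "veracity": veracity,
    }


summary = summarize(events)
print(summary)
```

At big data scale the same questions apply, only the batch is far too large for a single Python list and the numbers are computed by distributed systems.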

Big Data Applications

  • Smarter Healthcare: Using petabytes of patient data, an organization can extract meaningful information and build applications that predict a patient’s deteriorating condition in advance.
  • Telecom: The telecom sector collects information, analyzes it and provides solutions to different problems. By using big data applications, telecom companies have been able to significantly reduce data packet loss, which occurs when networks are overloaded, and thus provide a seamless connection to their customers.
  • Retail: Retail has some of the tightest margins and is one of the greatest beneficiaries of big data. Its main value in retail is in understanding consumer behavior. Amazon’s recommendation engine, for example, provides suggestions based on the consumer’s browsing history.
  • Traffic control: Traffic congestion is a major challenge for many cities globally. Effective use of data and sensors will be key to managing traffic better as cities become increasingly densely populated.
  • Manufacturing: Analyzing big data in the manufacturing industry can reduce component defects, improve product quality, increase efficiency, and save time and money.
  • Search Quality: Every time we extract information from Google, we simultaneously generate data for it. Google stores this data and uses it to improve its search quality.
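To give a feel for how browsing history can drive suggestions in the retail case, here is a minimal co-occurrence sketch in Python. It is not Amazon’s actual method, which is not described in this post; the `recommend` function and the sample histories are hypothetical and only show the basic idea of "people who viewed this also viewed that".

```python
from collections import Counter


def recommend(histories, user_history, top_n=2):
    """Suggest items that co-occur with what the user has already viewed."""
    viewed = set(user_history)
    scores = Counter()
    for history in histories:
        items = set(history)
        overlap = items & viewed
        if overlap:
            # Every unseen item in a related history gets credit
            # proportional to how much that history overlaps with the user's.
            for item in items - viewed:
                scores[item] += len(overlap)
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up browsing histories of other shoppers.
histories = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "mouse"],
    ["phone", "charger", "earbuds"],
]

suggestions = recommend(histories, ["phone"])
print(suggestions)
```

With more histories the counts become more reliable, which is one reason recommendation quality tends to improve as the volume of browsing data grows.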

So, I hope this blog helped you get a basic idea of Big Data.

Stay tuned for more!
