BIG DATA

Monisha Kumar
6 min read · May 25, 2022


In this blog, we are going to look at big data. First, we will clear up what data and big data are, and then see how big data works and how it is used.

What is Data?

→ Data is a collection of facts (numbers, text, images, and so on) on which a computer performs operations.

What is Big data?

→ Data science is the study of data by analyzing it with advanced technologies (Machine Learning, Artificial Intelligence, Big data).

→ Big Data is a collection of data that is huge in volume and still growing exponentially with time.

→ It is data so large and complex that traditional data management tools cannot store or process it efficiently.

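→ Big data comes in 3 types.

1. Structured

Any data that can be stored, accessed, and processed in a fixed format is termed ‘structured’ data, for example rows in a relational database table.

2. Unstructured

Any data with no known form or structure is classified as unstructured data, for example free text, images, or video.

3. Semi-structured

Semi-structured data combines both forms: it has no fixed schema, but it carries tags or keys that describe its contents, for example JSON or XML.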
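To make the three types concrete, here is a minimal Python sketch. The sample values (the names, the CSV columns, the JSON keys) are made up for illustration and are not from any real dataset.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "id,name,age\n1,Asha,21\n2,Ravi,23\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: keys describe the data, but records may differ in shape.
semi_structured_json = '[{"id": 1, "name": "Asha"}, {"id": 2, "tags": ["new"]}]'
records = json.loads(semi_structured_json)

# Unstructured: free text with no predefined fields.
unstructured_text = "Asha called support on Monday about a late delivery."
word_count = len(unstructured_text.split())

print(rows[0]["name"])            # field access by column name
print(sorted(records[1].keys()))  # keys vary from record to record
print(word_count)                 # only generic measures apply directly
```

Notice that the structured rows can be queried by column name straight away, while the free text needs extra processing before it can be analyzed.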

→ Big data is used in machine learning projects, predictive modeling, and other advanced analytics applications.

Example:

→ A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, the data generated reaches many petabytes.
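The arithmetic behind this example can be sketched in a few lines of Python. Only the 10 terabytes per 30 minutes figure comes from the example. The flight length and the number of flights per day are assumptions chosen for illustration, and one engine per flight is counted.

```python
TB_PER_HALF_HOUR = 10     # from the example above (lower bound of "10+")
FLIGHT_HOURS = 2          # assumption: average flight length
FLIGHTS_PER_DAY = 5_000   # assumption: stands in for "many thousands"
TB_PER_PB = 1_000         # decimal units: 1 petabyte = 1,000 terabytes


def daily_petabytes(tb_per_half_hour, flight_hours, flights_per_day):
    """Return the data generated per day in petabytes, one engine per flight."""
    tb_per_flight = tb_per_half_hour * 2 * flight_hours  # two half-hours per hour
    return tb_per_flight * flights_per_day / TB_PER_PB


print(daily_petabytes(TB_PER_HALF_HOUR, FLIGHT_HOURS, FLIGHTS_PER_DAY))  # 200.0
```

Under these assumed numbers the total is 200 petabytes per day, which is why traditional tools struggle with this kind of data.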

Why Big Data?

→ Big Data initiatives were rated as “extremely important” by 93% of companies.

→ Leveraging a Big Data analytics solution helps organizations unlock strategic value and take full advantage of their assets.

Characteristics of Big Data:

→ Big data has 10 V’s of characteristics. They are Volume, Variety, Velocity, Veracity, Value, Variability, Volatility, Visualization, Validity, and Vulnerability.

→ Of these, 3 V’s are the most important: Volume, Velocity, and Variety.

Volume:

→ The size and amount of data that companies manage and analyze.

Velocity:

→ The speed at which companies receive, store and manage data.

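Variety:

→ The range of data types and formats that companies handle (structured, unstructured, and semi-structured).

→ The value of big data usually comes from insight discovery and pattern recognition that lead to more effective operations, stronger customer relationships, and other clear and quantifiable business benefits.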
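As a rough illustration of the 3 V’s, the sketch below summarizes a small batch of incoming records. The function name `summarize_batch`, the record layout, and the sample values are all hypothetical, chosen only to show one way each V could be measured.

```python
def summarize_batch(records, seconds):
    """Summarize a batch of (format, payload) records by the 3 V's."""
    # Volume: total size of the payloads in bytes.
    volume_bytes = sum(len(payload.encode("utf-8")) for _, payload in records)
    # Velocity: records received per second over the batch window.
    velocity = len(records) / seconds
    # Variety: the distinct data formats seen in the batch.
    variety = sorted({fmt for fmt, _ in records})
    return {"volume_bytes": volume_bytes, "velocity": velocity, "variety": variety}


batch = [
    ("json", '{"user": 1}'),
    ("csv", "1,click"),
    ("text", "hello"),
    ("json", '{"user": 2}'),
]
summary = summarize_batch(batch, seconds=2)
print(summary)
```

In a real system these numbers would be far larger, but the idea is the same: how much data, how fast it arrives, and in how many forms.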

Where is this Big Data coming from?

Social Media:

  • Big data companies like Facebook and Google collect data from the activities we perform on their platforms.
  • Other examples are YouTube, Twitter, LinkedIn, blogs, SlideShare, Instagram, Chatter, WordPress, Jive, etc.

Public Web:

  • This includes data coming from Wikipedia, health care services, the World Bank, government, weather, traffic, etc.

Archives:

  • This includes archives of any data like medical records, customer correspondence, insurance forms, scanned documents, etc.

Docs:

  • Documents of any format, including HTML, CSV, PDF, XLS, Word, XML, etc., are sources of big data.

Media:

  • Images, video, audio, live streams, podcasts, etc.

Data storage:

  • The various databases and file systems used to store data also serve as sources of big data.

Machine Log Data:

  • Data coming from servers, application logs, audit logs, CDRs (call detail records), various mobile apps, mobile location, etc.

Sensor Data:

  • Data from sensors connected to medical devices, road cameras, satellites, traffic surveillance devices, video games, household appliances, air conditioning units, office buildings etc.

Example:

We know that Facebook has 2.9+ billion users and a university portal has ~7.5 lakh students.

But why does the university portal go down while the Facebook application doesn’t?

Facebook is cluster oriented.

→ A cluster is a group of servers that work together, so requests and data are divided among many machines instead of resting on a single one.

OLTP is used.

The university portal is a client-server relationship.

→ A client-server relationship describes how a server can provide resources or services to one or more clients.

ETL is used.
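The difference between one server and a cluster can be sketched with a toy model. This is only an illustration with made-up numbers (1,000 requests, 300 requests per server) and simple round-robin routing. It is not how Facebook or any real portal routes its traffic.

```python
def rejected_requests(num_requests, num_servers, capacity_per_server):
    """Spread requests round-robin over servers and count those over capacity."""
    load = [0] * num_servers
    rejected = 0
    for i in range(num_requests):
        server = i % num_servers  # round-robin choice of server
        if load[server] < capacity_per_server:
            load[server] += 1
        else:
            rejected += 1  # this server is full, so the request fails
    return rejected


# Hypothetical load: the same 1,000 requests hit both setups.
single = rejected_requests(1_000, num_servers=1, capacity_per_server=300)
cluster = rejected_requests(1_000, num_servers=5, capacity_per_server=300)
print(single, cluster)  # 700 0
```

With one server, 700 of the 1,000 requests are turned away. With five servers sharing the load, none are.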

How can Big Data help companies?

  • Making better business decisions
  • Understanding your customers
  • Delivering smarter services or products
  • Improving business operations
  • Generating an income

Companies that use big data:

→ [Image: the top 10 companies using big data in the world]

OLTP:

Online Transaction Processing.

What is OLTP?

OLTP is a system that manages a very large number of short online transactions, for example, ATM withdrawals.

→ It is used to process online transactions and maintain record integrity in multi-access environments.
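Here is a minimal sketch of one short transaction, using the ATM example and Python’s built-in `sqlite3` module. The `accounts` table, the `withdraw` function, and the balances are assumptions made up for illustration. A real banking system is far more involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 500)")  # hypothetical account
conn.commit()


def withdraw(conn, account_id, amount):
    """Withdraw atomically: commit on success, roll back on any error."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (account_id,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


ok = withdraw(conn, 1, 200)       # succeeds, balance becomes 300
failed = withdraw(conn, 1, 1000)  # rolled back, balance stays 300
final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 1"
).fetchone()[0]
print(ok, failed, final_balance)  # True False 300
```

The second withdrawal is undone completely, which is the record integrity that OLTP systems are built to protect.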

Architecture of OLTP:

→ The most common architecture of an OLTP system that uses transactional data is a three-tier architecture that typically consists of a presentation tier, a business logic tier, and a data store tier.

→ The presentation tier is the front end, where the transaction originates via a human interaction or is system-generated.
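The three tiers can be sketched as three small pieces of Python. Everything here (the `BookStore` class, the `purchase` and `handle_click` functions, the book ID) is hypothetical, and a dictionary stands in for a real database.

```python
# Data store tier: holds the records (a dict stands in for a database).
class BookStore:
    def __init__(self):
        self.stock = {"python-101": 2}  # hypothetical inventory
        self.orders = []

    def get_stock(self, book_id):
        return self.stock.get(book_id, 0)

    def save_order(self, book_id):
        self.stock[book_id] -= 1
        self.orders.append(book_id)


# Business logic tier: applies the rules of a purchase.
def purchase(store, book_id):
    if store.get_stock(book_id) <= 0:
        return "out of stock"
    store.save_order(book_id)
    return "order placed"


# Presentation tier: where the transaction originates (a function call
# stands in for a web form or an app screen).
def handle_click(store, book_id):
    return f"{book_id}: {purchase(store, book_id)}"


store = BookStore()
results = [handle_click(store, "python-101") for _ in range(3)]
print(results)
```

Each tier talks only to the one below it, so the front end never touches the stored data directly.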

Examples of OLTP Transactions

Examples of OLTP transactions include:

  • Online banking
  • Purchasing a book online
  • Booking an airline ticket
  • Sending a text message
  • Order entry
  • Telemarketers entering telephone survey results
  • Call center staff viewing and updating customers’ details

ETL:

‘Extract, Transform, and Load’.

What is ETL?

ETL is a process that extracts, transforms, and loads data from multiple sources to a data warehouse or other unified data repository.

→ ETL provides the foundation for data analytics and machine learning workstreams.

→ ETL is often used by an organization to:

  • Extract data from legacy systems
  • Cleanse the data to improve data quality and establish consistency
  • Load data into a target database
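The three steps above can be sketched in Python with only the standard library. The CSV content, the column names, and the `sales` table are made up for illustration. A real ETL job would read from actual source systems and load into a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical export from a legacy system: inconsistent case, stray spaces,
# and one row with an invalid amount.
LEGACY_CSV = """store,item,amount
 Mall-A ,shoes,1200
mall-a,SHIRT,800
Mall-B,shoes,not_a_number
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize text fields and drop rows with invalid amounts."""
    clean = []
    for row in rows:
        try:
            amount = int(row["amount"])
        except ValueError:
            continue  # cleanse: skip rows that cannot be parsed
        clean.append((row["store"].strip().lower(), row["item"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (store TEXT, item TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(LEGACY_CSV)))
total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE store = 'mall-a'"
).fetchone()[0]
print(total)  # 2000
```

After cleansing, the two differently written "Mall-A" rows end up under one consistent store name, so they can be reported together.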

Example:

Consider managing sales data in a shopping mall. If a user wants the historical data as well as the current data for the mall, the first step is always to run the ETL process. The resulting data can then be used for reporting. ETL tools are also widely used in data migration projects.

→ Data Warehouse: A data warehouse is used to bring all the data together into one common dataset.

Thanks for reading my blog…!

CHEERS…

See you in the next blog…

MONISHA K…!

