Apache Pig for Big Data Analysis

Learn how Apache Pig deals with big data for analysis.

Andrea Perera
Geek Culture

Created by the author

Apache Pig is a platform for analyzing big data. Its programs are written in Pig Latin, a scripting language that runs on top of Hadoop and MapReduce. Thanks to Pig, we can analyze large amounts of data without writing MapReduce jobs by hand.
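To make that contrast concrete, here is a minimal sketch of a word count written as explicit map, shuffle, and reduce steps in plain Python. It runs on a single machine and only imitates what a MapReduce job does; the function names and sample lines are made up for illustration and are not the dataset used later in this article. The comment at the top shows roughly how the same logic could look in Pig Latin, with a hypothetical input path and relation names.

```python
# Roughly equivalent Pig Latin (illustrative; path and relation names are hypothetical):
#   lines   = LOAD 'input.txt' AS (line:chararray);
#   words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
#   grouped = GROUP words BY word;
#   counts  = FOREACH grouped GENERATE group, COUNT(words);

from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Chain the three phases into one local, in-memory word count."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["pig runs on hadoop", "pig latin compiles to mapreduce"]
    print(word_count(sample))
```

With Pig, the three hand-written phases collapse into a few declarative statements, and Pig takes care of turning them into MapReduce jobs.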

This article looks at how Apache Pig works and walks through a data analysis using an example dataset.

New to MapReduce? I recommend reading my MapReduce with Python article.

Where does Pig stand in the Hadoop ecosystem?

Hadoop Ecosystem: created by the author

Hadoop is an open-source framework used to solve big data problems. HDFS, YARN, and MapReduce are part of Hadoop itself; the other tools in the ecosystem were incorporated later, each to solve a particular problem.

The Hadoop Distributed File System (HDFS) enables the storage of large datasets…

Technical Writer | Software Engineer | MSc in Big Data Analytics | Email: andriperera.98@gmail.com | LinkedIn: Andrea Perera