Apache Pig for Big Data Analysis
Learn how Apache Pig deals with big data for analysis.
Apache Pig is a platform for analyzing big data. Its scripts are written in Pig Latin, a scripting language that runs on top of Hadoop and MapReduce. Thanks to Pig, we can analyze large amounts of data without writing any MapReduce jobs ourselves.
This article looks at how Apache Pig works and walks through a data analysis with an example dataset.
New to MapReduce? I recommend reading my MapReduce with Python article.
Where does Pig stand in the Hadoop ecosystem?
Hadoop is an open-source framework used to solve big data problems. HDFS, YARN, and MapReduce are part of Hadoop itself; the other tools in the ecosystem were each incorporated to solve a particular problem.
The Hadoop Distributed File System (HDFS) enables the storage of large datasets…