Big Data Analytics Pipeline using the Hadoop Ecosystem
Learn and implement the Hadoop Ecosystem to drive Big Data Analytics
The image above shows the pipeline for Big Data analytics using the Hadoop Ecosystem. Let’s learn the architecture of each component and then build on that with a practical, real-life project in the aviation domain. If you’re new to Big Data, I suggest going through the articles below in numbered order.
Data Ingestion
❷ Sqoop
❸ Flume
Data Storage
❶ HDFS — Comprehensive Guide, HDFS Commands, HDFS Erasure Coding
❼ HBase
Data Processing
❽ Spark
Data Analysis
❺ Pig
❻ Hive
Data Exploration
❾ Hue
Data Visualization
❿ Tableau
Others
Installation & Configuration
🄌 Cloudera Manager Startup/Shutdown
❶ High Availability in HDFS — Enable/Disable
❶ YARN High Availability — Enable/Disable
❷ Installing MySQL Database on GCP
The Project
🟍 The Homepage