How to Hadoop at home with Raspberry Pi — Part 2

Jason I. Carter
11 min read · Mar 26, 2016

To summarize what I’m doing and how I got here: a couple of weeks ago I decided to dive into the world of Hadoop, out of my interest in data engineering and analysis. And what’s the best way to do that? Build a Raspberry Pi Hadoop cluster, of course!

This is not a tutorial. Think of it more as a journey: there’s no nice step-by-step process here. I’m going to make mistakes, get errors, fix them and try to move on.

If you want to follow along, you should probably start with Part 1, which covers setting up the Raspberry Pi and some limited network configuration. In this part of the series, I’ll be installing and configuring Hadoop as a single-node installation.

  1. Part 1: Setting up Raspberry Pi and network configurations
  2. Part 2: Hadoop single node setup, testing and prepping the cluster
  3. Part 3: Hadoop cluster setup, testing and final thoughts

Before I forget and we get too far ahead of ourselves: everything here is Hadoop 2 running on YARN. Hadoop 2 comes with some significant changes, and as far as I can tell you can also use Hadoop 2 without YARN, which caused me some headaches. But enough of that, let us begin.

Hadoop group, users and SSH…

is pretty straightforward to set up. I’m simplifying things and creating only one group and one user, instead of the separate users for HDFS, MapReduce and YARN that seem to be recommended.
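
For reference, the single group, single user and SSH key setup described above could look roughly like this on Raspbian. This is a minimal sketch, not necessarily the exact commands used in this series: the group name hadoop and the user name hduser are placeholder assumptions, and the script only prints the commands unless it is run with DRY_RUN=0.

```sh
#!/bin/sh
# Sketch only. The names "hadoop" and "hduser" are assumptions for
# illustration; pick whatever group and user names you prefer.
HADOOP_GROUP="${HADOOP_GROUP:-hadoop}"
HADOOP_USER="${HADOOP_USER:-hduser}"
# Default to a dry run so nothing changes on the system by accident.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_hadoop_user() {
    # One shared group and one user instead of separate
    # users for HDFS, MapReduce and YARN.
    run sudo addgroup "$HADOOP_GROUP"
    run sudo adduser --ingroup "$HADOOP_GROUP" "$HADOOP_USER"

    # Passphrase-less SSH key for that user, authorized for logins to this
    # same machine (Hadoop's start scripts connect to nodes over SSH).
    run sudo -u "$HADOOP_USER" -H sh -c 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa && cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
}

setup_hadoop_user
```

Run as is, this just lists the three commands so you can read them first. With DRY_RUN=0 it actually executes them, and adduser will then prompt for a password and the usual user details.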
