Daping Du
Feb 24, 2017 · 1 min read

Thanks, Austin, for the nice note. I wrote a Jupyter notebook that covers these steps and lets you spin up however many nodes you want without hacking through terminal group input by hand. You can run the whole setup from within your Jupyter notebook, but you do need to install boto3 and awscli, and set up the aws_access_key_id and aws_secret_access_key through *your terminal* (‘pip install awscli’, then ‘aws configure’). (A similar notebook that sets up Spark is also there, but I have not finished the port-forwarding part that would let you run your Jupyter notebook against the cluster.) Link: https://github.com/ddu1/Hadoop-spark-setup/blob/master/Setup_Hadoop_Jupyter.ipynb
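As a rough sketch of what the boto3 side of this looks like (this is not the notebook’s actual code; the region, AMI ID, key pair name, and node count below are placeholders you’d replace with your own), launching the cluster nodes from a Jupyter cell boils down to something like:

```python
# Minimal sketch: launch a handful of EC2 instances from a Jupyter cell
# with boto3, after `aws configure` has stored your credentials.
import boto3

# Region is an assumption; use whichever region you configured.
ec2 = boto3.resource("ec2", region_name="us-east-1")

instances = ec2.create_instances(
    ImageId="ami-xxxxxxxx",    # placeholder AMI for your base image
    InstanceType="t2.medium",  # placeholder instance type
    KeyName="my-key-pair",     # placeholder key pair name
    MinCount=3,                # however many nodes you want
    MaxCount=3,
)

# Print the instance IDs so you can track them in the AWS console.
for inst in instances:
    print(inst.id)
```

The notebook linked above wraps this kind of call up with the rest of the Hadoop configuration, so you only have to pick the node count.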
