Motivation:
Jupyter Notebook is a popular choice for running machine learning algorithms and deep learning models. It’s convenient to use on a local machine after installing Anaconda (see Getting Started With Jupyter Notebook for Python). However, my project for a distributed systems class needs more computing resources on Amazon EMR, which requires me to access Jupyter Notebook securely in the cloud. Another advantage is that all group members can work together on the same publicly reachable notebook.
General Steps:
1. Launch an AWS instance from the EC2 console
- Create an IAM user or use an existing one
- Create an instance
- Access the instance with SSH
2. Set up configurations to use Jupyter Notebook
- Install Anaconda 3
- Configure the Jupyter server so that it runs on EC2 and can be reached from the browser on your local computer
3. Access Jupyter Notebook from local through the browser
Detailed Step by Step:
Launch an AWS instance from EC2 console
1. Create an IAM user or use an existing one.
AWS Identity and Access Management (IAM) enables you to securely control access to AWS services and resources for your users.
2. Go to the Amazon EC2 page and switch to a cheaper region such as Oregon. Launch an “Amazon Linux 2 AMI (HVM), SSD Volume Type” instance. For a first try, choose t2.micro (free tier eligible). Leave the first five steps at their defaults, and at the sixth step create a new security group or select an existing one. Make sure the security group allows inbound SSH (port 22) and the Jupyter port configured later in this guide (8888).
3. When asked for a key pair, choose an existing key pair or create a new one.
4. Log in to your instance over SSH using its public IP:
ssh -i ~/temp/xxx.pem ec2-user@54.218.135.102
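If ssh rejects the key with an "UNPROTECTED PRIVATE KEY FILE" warning, the .pem file is usually readable by other users. The sketch below is one way to check that before connecting and to assemble the same command as above. The helper names `key_is_private` and `build_ssh_command` are made up for this illustration, and `ec2-user` is assumed as the login name because that is what the command above uses.

```python
import os
import shlex
import stat


def key_is_private(key_path):
    """Return True if the key file grants no permissions to group or others."""
    mode = os.stat(os.path.expanduser(key_path)).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def build_ssh_command(key_path, host, user="ec2-user"):
    """Build the ssh command line as one shell-safe string."""
    # Expand "~" here, because a quoted "~" would not be expanded by the shell.
    parts = ["ssh", "-i", os.path.expanduser(key_path), f"{user}@{host}"]
    return " ".join(shlex.quote(part) for part in parts)
```

A typical use would be to call `key_is_private("~/temp/xxx.pem")` first and, if it returns False, tighten the permissions with `os.chmod(path, 0o400)` on the expanded path before running the command returned by `build_ssh_command`.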
Set up configurations to use Jupyter Notebook
1. Install Anaconda3 with:
wget https://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh
bash Anaconda3-4.4.0-Linux-x86_64.sh
Run conda in the terminal to check whether the installer added Anaconda to your PATH. If you get “-bash: conda: command not found”, add the Anaconda bin directory to your PATH in .bashrc so that the Anaconda commands are available in every new shell.
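As a rough sketch of that PATH fix, the snippet below checks whether conda can be found and prints the line you would append to .bashrc. It assumes Anaconda was installed to `~/anaconda3`, which is the installer's usual default; adjust `install_dir` if you chose another location. Both function names are hypothetical helpers for this example.

```python
import os
import shutil


def conda_on_path():
    """Return True if a conda executable is found on the current PATH."""
    return shutil.which("conda") is not None


def bashrc_export_line(install_dir="~/anaconda3"):
    """Return the export line that prepends Anaconda's bin directory to PATH."""
    # Assumption: install_dir is where the Anaconda installer put its files.
    bin_dir = os.path.join(os.path.expanduser(install_dir), "bin")
    return f'export PATH="{bin_dir}:$PATH"'


if __name__ == "__main__":
    if not conda_on_path():
        print("Add this line to ~/.bashrc, then open a new shell:")
        print(bashrc_export_line())
```

After adding the line, run `source ~/.bashrc` or log in again so the current shell picks up the change.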
2. Configure the Jupyter server so that it runs on EC2 and can be reached from the browser on your local computer.
Create a default config file:
jupyter notebook --generate-config
Create a self-signed SSL certificate so that the Jupyter server can be reached over HTTPS (the browser will still show a warning for a self-signed certificate, which you can accept):
- mkdir certs
- cd certs
- sudo openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
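Because -keyout and -out name the same file, mycert.pem should end up holding both the private key and the certificate, which is presumably why the Jupyter configuration in this guide only points at a single certificate file. The sketch below is one way to confirm that from the file's text. The function names are invented for this example, and it only looks at the PEM block labels; it does not validate the key or the certificate.

```python
import re


def pem_block_types(pem_text):
    """Return the label of every PEM block in the text, in file order."""
    return re.findall(r"-----BEGIN ([A-Z0-9 ]+)-----", pem_text)


def has_key_and_certificate(pem_text):
    """Return True if the text contains a private key block and a certificate block."""
    labels = pem_block_types(pem_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    has_cert = "CERTIFICATE" in labels
    return has_key and has_cert
```

You could read the file with `open("mycert.pem").read()` and pass the text to `has_key_and_certificate`. Since the certificate was created with sudo, reading it as ec2-user may require adjusting the file's ownership first.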
Configure Jupyter
- cd ~/.jupyter/
- vi jupyter_notebook_config.py
Insert this at the beginning of the document:
c = get_config()
# Kernel config
c.IPKernelApp.pylab = 'inline'  # if you want plotting support always in your notebook
# Notebook config
c.NotebookApp.certfile = u'/home/ec2-user/certs/mycert.pem'  # location of your certificate file
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False  # so that the notebook server does not open a browser by default
# Set the port to 8888, the port we set up in the AWS EC2 set-up
c.NotebookApp.port = 8888
3. Create a folder for your notebooks. Save all notebooks there and use it as the root directory from now on.
- mkdir Notebooks
- cd Notebooks
4. Go into the Notebooks directory and start the Jupyter Notebook service:
jupyter notebook
Access Jupyter Notebook from local through the browser
To open the notebook in your browser, you’ll need your instance’s public IP or Public DNS (IPv4).
Go to the following address in your browser (your IP will differ from mine):
https://54.218.135.102:8888/
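If the page does not load, a common cause is that port 8888 is not open in the security group or the server is not running. Here is a minimal sketch for checking that from your local machine. The names `notebook_url` and `port_is_open` are made up for this example, and the check only tests whether a TCP connection succeeds, not whether Jupyter is the process answering.

```python
import socket


def notebook_url(host, port=8888):
    """Build the HTTPS address of the notebook server."""
    return f"https://{host}:{port}/"


def port_is_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("54.218.135.102")` returning False while SSH still works would suggest looking at the security group rules or at whether the notebook server is still running.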
Yeah! Now you can use Python 3 with Jupyter Notebook on AWS!
References:
- MSDS697 Distributed Data Systems, University of San Francisco, Intersession 2019, taught by Prof. Diane Woodbridge
- Setting up AWS EC2 for Running Jupyter Notebook on GPU in the Cloud
- Using Jupyter Notebooks to Run Deep Learning Algorithms — 2017 AWS Online Tech Talks