A How-to on Deep Reinforcement Learning: Setup AWS with Keras/Tensorflow, OpenAI Gym, and Jupyter

For those of you getting started with deep learning or deep reinforcement learning, you’ll know that it helps to work with GPUs. GPUs can speed up the learning time for deep neural networks, and trying to run deep learning models on your local machine can be slow. This is where Amazon Web Services (AWS) can help. On AWS, you can provision a g2 or p2 instance — these are the instance types that have powerful GPUs. The p2 is the latest GPU enabled instance type, and I recently tried to set up an AWS p2 instance for deep reinforcement learning.

What is deep reinforcement learning?

See the seminal paper https://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

What are popular libraries to start on this?

A popular Python API for deep learning is Keras (with Theano as the backend). For reinforcement learning, OpenAI Gym is a common framework to get started with problems.

Why did I write this setup guide?

Getting both of these tools set up properly on an Amazon p2 instance can be tricky. I ran into this error when trying to first set up the GPU + Theano + Keras on AWS and then trying to record videos of my reinforcement learning agent using the OpenAI gym:

User “pemami4911” talks about how an NVIDIA driver issue prevents the framework from recording videos. My understanding is this:

Beyond getting OpenAI Gym and the deep learning frameworks to work together, we also want to make it easy to test and run our solutions to the deep reinforcement learning problems. We will do this by setting up Jupyter, an open source web application for running Python code. The advantage of Jupyter is that we get a nice UI for running our code even though it runs remotely on AWS.

The steps below take you end to end through setting up AWS and installing all the necessary drivers and libraries, so that you can quickly iterate on deep reinforcement learning problems. This guide pieces together instructions from several sources and is specifically tailored for Amazon p2 instances (though you can use others with small tweaks):

  1. Take a look at http://course.fast.ai/lessons/aws.html to request a p2 instance limit increase from AWS. You’ll need to create an AWS account and request a p2 instance limit of 1. Amazon will take some time to verify that your account is not fraudulent before allowing you access to a p2 instance. Note: not all AWS regions have p2 instances, so select the region closest to you in the AWS console and look for “p2.xlarge” under https://console.aws.amazon.com/console/home > EC2 > Limits. I recommend provisioning a p2.xlarge instance because it’s the cheapest at $0.90/hr, but a p2.8xlarge also works if you want multiple GPUs (it’s $7.20/hr as of the time of writing).
  2. Meanwhile, set up an AWS key pair to allow you easy access to your future instances.
# Assumes you have pip installed. It should come with Python as of 2.7.9. If not, please look at https://packaging.python.org/tutorials/installing-packages/#requirements-for-installing-packages
pip install awscli
aws configure # you'll need your Access Key ID, Secret Access Key, region, and "text" for output format
Copy the following (taken from https://github.com/fastai/courses/blob/master/setup/setup_instance.sh) into a file called create_key_pair.sh:
export name="fast-ai"
if [ ! -d ~/.ssh ]
then
    mkdir ~/.ssh
fi
if [ ! -f ~/.ssh/aws-key-$name.pem ]
then
    aws ec2 create-key-pair --key-name aws-key-$name --query 'KeyMaterial' --output text > ~/.ssh/aws-key-$name.pem
    chmod 400 ~/.ssh/aws-key-$name.pem
fi
Run sh create_key_pair.sh
Run ls ~/.ssh/aws-key-fast-ai.pem to make sure you correctly created a key-pair file
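
If you prefer, the same check can be done from Python. This is only a minimal sketch: the helper name `check_key_file` is my own, and the path assumes the `fast-ai` key name exported in the script above.

```python
import os
import stat


def check_key_file(path):
    """Return True if `path` exists and its permission bits are exactly 0400."""
    if not os.path.isfile(path):
        return False
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o400


if __name__ == "__main__":
    key_path = os.path.expanduser("~/.ssh/aws-key-fast-ai.pem")
    if check_key_file(key_path):
        print(key_path + " looks OK")
    else:
        print(key_path + " is missing or has the wrong permissions")
```

ssh refuses private keys that are readable by other users, which is why the script runs chmod 400 on the file.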

3. Once you have a p2 instance limit of at least 1, it’s time to provision an instance. Go to https://console.aws.amazon.com/console/home > EC2 > Instances > Launch Instance

4. Select “Ubuntu Server 16.04 LTS (HVM), SSD Volume Type” (or equivalent Ubuntu 16.04 instance type) > p2.xlarge > Next: Configure Instance Details > (no need to change anything unless you know what you’re doing on this screen) > Next: Add Storage > Increase to 32GB at least> Review and Launch > Launch (look over the details again to make sure your storage upgrade is correct) > Select “aws-key-fast-ai” key pair and tick the “I acknowledge” box > Launch Instances

5. Congratulations! You’ve now successfully provisioned an AWS instance. You can find your instance ip and instance id at https://console.aws.amazon.com/console/home > EC2 > Instances > Click on your instance

6. Copy https://github.com/fastai/courses/blob/master/setup/aws-alias.sh locally to your ~/.bashrc and run

source ~/.bashrc # to have these handy aliases ready.

7. ssh in

# check if your instance is marked "running" on the console. If not, right click it > Instance State > Start
aws-ssh # hit yes to accept

8. Install basic tools, Python2, Python3:

sudo apt-get update
# If prompted, install the package maintainer's version:
sudo apt-get -y dist-upgrade
sudo apt-get install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy build-essential python-pip python3-pip python3-venv swig python3-wheel libcurl3-dev
sudo apt-get install -y gcc g++ gfortran git linux-image-generic linux-headers-generic linux-source linux-image-extra-virtual libopenblas-dev

9. Find the right driver for your instance GPU at http://www.nvidia.com/Download/index.aspx?lang=en-us . For a p2 instance, I searched for Tesla K80 after finding out this was the GPU accelerator used in p2 instances. The K80 is built on GK210 GPUs, but I was not able to find GK210 in the search drop-down menu, so I assumed you search by the name of the GPU accelerator instead.

10. Assuming you need the 375.66 driver, run

# this url path is the standard for drivers. Replace 375.66 with your driver as needed
wget -P ~/Downloads/ http://us.download.nvidia.com/XFree86/Linux-x86_64/375.66/NVIDIA-Linux-x86_64-375.66.run
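
If you need a different driver, the URL pattern above is easy to parameterize. Here is a small sketch; the function name `nvidia_driver_url` is hypothetical, and it assumes NVIDIA keeps the same path layout for the version you pick, so check that the link resolves before relying on it.

```python
def nvidia_driver_url(version):
    """Build the download URL for a Linux x86_64 driver, following the pattern above."""
    filename = "NVIDIA-Linux-x86_64-{0}.run".format(version)
    return "http://us.download.nvidia.com/XFree86/Linux-x86_64/{0}/{1}".format(version, filename)


if __name__ == "__main__":
    print(nvidia_driver_url("375.66"))
```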

11. Do some cleanup of existing stuff:

sudo rm /etc/X11/xorg.conf # It's ok if this doesn't exist
NVIDIA will clash with the nouveau driver so deactivate it:
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
Type i to insert the following lines and save by doing Esc then :wq
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

After saving the file, run

sudo update-initramfs -u
sudo reboot # reboot will take a few minutes. After that, you can ssh back in
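
Once you are back in, you may want to confirm that nouveau really is gone before installing the NVIDIA driver. The sketch below reads /proc/modules, which lists one loaded kernel module per line with the module name first; the function names are my own.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def nouveau_is_loaded(path="/proc/modules"):
    """Return True if the nouveau kernel module is currently loaded."""
    with open(path) as handle:
        return "nouveau" in loaded_modules(handle.read())
```

On the instance, nouveau_is_loaded() should return False after the reboot.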

11. Install NVIDIA drivers. Note: The no-opengl-files option is important.

chmod +x ~/Downloads/NVIDIA-Linux-x86_64-375.66.run
sudo sh ~/Downloads/NVIDIA-Linux-x86_64-375.66.run --no-opengl-files
sudo reboot
# reboot will take a few minutes
sudo modprobe nvidia

From my experience, the following warnings are ok while installing:

WARNING: nvidia-installer was forced to guess the X library path '/usr/lib' and X module path '/usr/lib/xorg/modules'; these paths were not queryable from the system.  If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver.
WARNING: Unable to find a suitable destination to install 32-bit compatibility libraries. Your system may not be
set up for 32-bit compatibility. 32-bit compatibility files will not be installed; if you wish to
install them, re-run the installation and set a valid directory with the --compat32-libdir option.

12. Install CUDA, which is the computing platform for NVIDIA GPUs. As of the time of writing, CUDA 8.0.61 is the latest version but feel free to change to a different version as long as it is compatible with the GPU from above. Note: The no-opengl-libs option is important.

wget -P ~/Downloads/ https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run
chmod +x ~/Downloads/cuda_8.0.61_375.26_linux-run
sudo sh ~/Downloads/cuda_8.0.61_375.26_linux-run --override --no-opengl-libs

Here’s what you should select while installing:

Hit q to go to the bottom of the agreement.
Type accept to accept the EULA
Type n to NOT Install NVIDIA Accelerated Graphics Driver for Linux-x86_64
Type y to Install the CUDA 8.0 Toolkit
Enter toolkit location to be (default): /usr/local/cuda-8.0
Yes to install a symbolic link at /usr/local/cuda
Yes to install the CUDA 8.0 Samples
Default location for CUDA Samples location: ~/Downloads

As recommended by the CUDA installer, update PATH and LD_LIBRARY_PATH in your ~/.bashrc

vim ~/.bashrc
Add the following environment variables and save:
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
Now save the file and run:
source ~/.bashrc
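
To double check that the new variables took effect, something like the following can help. It is a rough sketch: `missing_cuda_entries` is a hypothetical helper, and the default `cuda_root` assumes the /usr/local/cuda-8.0 toolkit location chosen above.

```python
import os


def missing_cuda_entries(environ, cuda_root="/usr/local/cuda-8.0"):
    """Return the names of the variables that lack the expected CUDA directory."""
    expected = {
        "PATH": cuda_root + "/bin",
        "LD_LIBRARY_PATH": cuda_root + "/lib64",
    }
    missing = []
    for name, directory in expected.items():
        entries = environ.get(name, "").split(os.pathsep)
        if directory not in entries:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    # An empty list means both variables contain the CUDA directories.
    print(missing_cuda_entries(os.environ))
```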

13. (Optional) Test NVIDIA drivers

cat /proc/driver/nvidia/version to verify installation
nvcc -V to check the CUDA toolkit (compiler) version
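
If you want to compare the installed driver against the version you downloaded, here is a small sketch that pulls the version number out of /proc/driver/nvidia/version. The regular expression assumes the first line looks like "NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.66 ...", which is what I have seen on this setup; the function names are my own.

```python
import re


def parse_driver_version(text):
    """Extract the driver version (e.g. '375.66') from the NVRM line, or return None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def installed_driver_version(path="/proc/driver/nvidia/version"):
    """Read the version file; return None if the file is absent (driver not loaded)."""
    try:
        with open(path) as handle:
            return parse_driver_version(handle.read())
    except IOError:
        return None
```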

14. (Optional) Test out CUDA by running the samples

Navigate to the CUDA samples (eg ~/Downloads/NVIDIA_CUDA-8.0_Samples) we installed earlier and run $ make
Navigate to NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release/ and run:
./deviceQuery  # see your graphics card specs
./bandwidthTest # check if it's operating correctly
(Both should state they PASS)

15. Set up CUDNN, which is the neural network library for CUDA

Download CUDNN from https://developer.nvidia.com/, the NVIDIA developers site, to your local machine (needs log in/password) -> Find the latest version. As of the time of writing, CUDNN 5.1 is the latest version that works with Tensorflow, another popular Deep Learning library, and thus, it is the version installed below.
On your local machine, run:
aws-ip  # to export the instanceIp variable
scp ~/Downloads/cudnn-8.0-linux-x64-v5.1.tgz ubuntu@$instanceIp:~/Downloads # assumes you downloaded it to ~/Downloads locally
On the AWS instance, run:
cd Downloads
tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/lib64/* /usr/local/cuda/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
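
To confirm which CUDNN version ended up in place, you can read the version macros from the copied header. This sketch assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the 5.1 header does; the function name `cudnn_version` is hypothetical.

```python
import re


def cudnn_version(header_text):
    """Return 'major.minor.patch' from cudnn.h contents, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)
```

On the instance you would call it as cudnn_version(open("/usr/local/cuda/include/cudnn.h").read()).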

16. Install and setup Theano and Keras for Python 2. For a full list of theanorc and keras options, see http://deeplearning.net/software/theano/library/config.html#envvar-THEANORC and https://keras.io/backend/#kerasjson-details respectively.

pip install --upgrade pip  # just to be safe and upgrade pip so that you don't get a warning
sudo chown -R $USER /usr/local # just to make sure we don't run into the error Permission denied: '/usr/local/lib/python2.7/dist-packages/yaml'  (Be careful on what you CHOWN - eg don't chown /usr)
pip install theano # Note for python3, you can do pip3 install theano
pip install keras # Note for python3, you can do pip3 install keras
echo "[global]
device = gpu
floatX = float32

[cuda]
root = /usr/local/cuda" > ~/.theanorc
mkdir -p ~/.keras
echo '{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "theano"
}' > ~/.keras/keras.json
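
A typo in keras.json is easy to miss, so it can be worth reading the file back. This is a minimal sketch using only the standard library; `keras_settings` is a hypothetical helper name, and it only looks at the two keys that matter for the Theano setup above.

```python
import json
import os


def keras_settings(path=os.path.expanduser("~/.keras/keras.json")):
    """Return (backend, image_dim_ordering) as stored in the Keras config file."""
    with open(path) as handle:
        config = json.load(handle)
    return config.get("backend"), config.get("image_dim_ordering")
```

With the file written above, keras_settings() should return the backend "theano" and the ordering "th". json.load raises a ValueError if the file is not valid JSON.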

17. Install OpenAI gym for reinforcement learning.

First, we set up Box2D-py, which is needed for environments such as LunarLander-v2.
cd ~/Downloads
git clone https://github.com/pybox2d/pybox2d.git
cd pybox2d
python setup.py build
python setup.py install
Next, we follow the instructions at https://github.com/openai/gym/ to install the package in full.
cd ~/Downloads
git clone https://github.com/openai/gym.git
sudo apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig
cd gym
pip install -e '.[all]'
pip install matplotlib  # useful for rendering plots

18. Install and configure Jupyter

pip install jupyter
jupyter notebook --generate-config
jupass=`python -c "from notebook.auth import passwd; print(passwd())"`
# Enter some password here. Doesn't have to be long. You'll use it to login into your jupyter notebooks.
echo "c.NotebookApp.password = u'"$jupass"'" >> $HOME/.jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False" >> $HOME/.jupyter/jupyter_notebook_config.py

19. Add a security group to be able to hit port 8888 (where Jupyter will start up) on your remote machine (thanks to user “Min” on kenophob.io for some of these instructions).

# create security group 
aws ec2 create-security-group --group-name JupyterSecurityGroup --description "My Jupyter security group"

# add security group rules; replace YOUR_CIDR with the IP range allowed to connect (for example, your own IP address followed by /32)
aws ec2 authorize-security-group-ingress --group-name JupyterSecurityGroup --protocol tcp --port 8888 --cidr YOUR_CIDR
aws ec2 authorize-security-group-ingress --group-name JupyterSecurityGroup --protocol tcp --port 22 --cidr YOUR_CIDR
aws ec2 authorize-security-group-ingress --group-name JupyterSecurityGroup --protocol tcp --port 443 --cidr YOUR_CIDR
# In your AWS management console, right click on your instance and do
Networking > Change Security Groups > Select JupyterSecurityGroup > Assign security groups
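
Once Jupyter is running, a quick way to tell whether the security group is letting you through is to try a plain TCP connection from your local machine. The sketch below uses only the standard library; `port_is_open` is a hypothetical helper, and you would pass your own instance IP.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    connection.close()
    return True
```

For example, port_is_open("YOUR_INSTANCE_IP", 8888) returning False while the notebook server is up usually points at the security group rules or the instance assignment.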

20. Test out a simple OpenAI program that does Q-Learning and verify video rendering works!

Create an account at https://gym.openai.com/. You'll get an API key after signing in and going to your profile. You'll need this API key below.

Start screen to ensure that accidentally losing your ssh connection does not kill your program. Inside the screen, start a fake X server and set up Jupyter.

screen -S "openai"
xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" bash
jupyter notebook
On your local machine, run:
aws-nb # to open your $instanceIp:8888
Create a notebook by going to New > Python 2
Copy the following into your Jupyter notebook:
import gym
import math
import numpy as np
import tempfile
from gym import wrappers

tdir = tempfile.mkdtemp()
env = gym.make('FrozenLake-v0')
env = wrappers.Monitor(env, tdir, force=True)

def qLearning(env, alpha=1.0, gamma=0.9, epsilon=0.99, epsilonDecay=1-1e-4, maxIters=1000):
    # unwrap to the underlying environment to read the state and action counts
    nS = env.env.env.nS
    nA = env.env.env.nA
    Q = np.zeros((nS, nA), dtype=np.float64)
    for i in range(1, maxIters + 1):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if np.random.random() < epsilon:
                action = env.env.action_space.sample()
            else:
                action = np.argmax(Q[state])
            statePrime, reward, done, _ = env.step(action)
            if done:
                # terminal transition: nothing to bootstrap from
                tdError = reward - Q[state][action]
            else:
                tdError = reward + (gamma * Q[statePrime].max()) - Q[state][action]
            Q[state][action] += alpha * tdError
            state = statePrime
        # decay epsilon after each episode
        epsilon *= epsilonDecay

    pi = np.argmax(Q, axis=1)
    return pi, Q

pi, Q = qLearning(env)
env.close()  # close the Monitor so the recordings are written before uploading
gym.upload(tdir, api_key='YOUR_API_KEY_HERE')

Before running the file above, make sure to replace YOUR_API_KEY_HERE with your actual API key. This will allow you to upload your videos to the OpenAI site for evaluation. Note you’ll need to actually fix the Q learning to solve the environment. Enjoy the challenge :)
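
If you want to sanity check the temporal-difference update on its own before touching FrozenLake, a tiny deterministic problem makes mistakes easy to spot. The sketch below is not the notebook code: `ChainEnv` is a made-up five-state corridor (not a Gym environment), and the hyperparameters are illustrative assumptions rather than tuned values.

```python
import numpy as np


class ChainEnv(object):
    """Hypothetical corridor: start in state 0, reward 1 on reaching state 4."""
    nS, nA = 5, 2  # actions: 0 = left, 1 = right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.nS - 1)
        done = self.state == self.nS - 1
        return self.state, (1.0 if done else 0.0), done, {}


def q_learning(env, alpha=0.5, gamma=0.9, epsilon=0.5, episodes=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = np.random.RandomState(seed)
    Q = np.zeros((env.nS, env.nA))
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.rand() < epsilon:
                action = rng.randint(env.nA)       # explore
            else:
                action = int(np.argmax(Q[state]))  # exploit
            next_state, reward, done, _ = env.step(action)
            # Terminal transitions have no future value to bootstrap from.
            target = reward if done else reward + gamma * Q[next_state].max()
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state
    return np.argmax(Q, axis=1), Q


if __name__ == "__main__":
    policy, Q = q_learning(ChainEnv())
    print(policy)
    print(Q)
```

In this corridor the value of moving right from the state next to the goal should approach 1, and each step further away should be discounted by another factor of gamma. FrozenLake is stochastic, so the learning rate and exploration schedule matter much more there.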

When you are not using your instance, make sure to run aws-stop or stop the instance from the console to avoid getting billed.

Note you don’t need to set up all of the things from scratch if you use an AMI in AWS that has libraries pre-installed (you can see all the free and paid AMIs when you go to provision an instance). However, in using an AMI, you run the risk of OpenAI not working properly due to the OpenGL issue.

To dive into deep reinforcement learning, there are plenty of great resources and code examples online. In terms of starter books, check out the classic Reinforcement Learning textbook by Sutton and Barto at http://incompleteideas.net/sutton/index.html and the deep learning textbook at http://www.deeplearningbook.org/. The steps above should let you get everything set up to get started!

Optional: Install Tensorflow

sudo apt-get install libcupti-dev  # needed for tensorflow
Follow instructions at https://www.tensorflow.org/install/install_linux#InstallingNativePip