How to Set Up a Deep Learning Environment on AWS with Keras/Theano
Updated: 05/19/2017 — In this article, I’ll explain step by step how to set up a deep learning environment running on Amazon’s EC2 GPU instance using:
- Image: ami-b03ffedf (Ubuntu Server 16.04 LTS (HVM), SSD Volume Type)
- Region: eu-central-1 (EU Frankfurt)
- Instance type: g2.2xlarge
- Storage: 30 GB (at least 20 GB recommended)
Software: CUDA 8.0, cuDNN 5.0, Anaconda3 4.2.0, Keras with Theano as backend
1. Create a new EC2 GPU instance
2. Install CUDA/cuDNN on the GPU Instance
NVIDIA Driver
Update the graphics driver:
$ sudo add-apt-repository ppa:graphics-drivers/ppa -y
$ sudo apt-get update
$ sudo apt-get install -y nvidia-375 nvidia-settings
CUDA
SSH into the EC2 GPU Instance:
$ ssh -i ~/folder_key_pair/key_pair.pem ubuntu@public_dns_ec2
Download CUDA 8.0 first, e.g. to your $HOME folder (/home/ubuntu):
$ wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
Install CUDA:
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
$ sudo apt-get update
$ sudo apt-get install -y cuda nvidia-cuda-toolkit
Check if everything was installed correctly:
$ nvidia-smi
$ nvcc --version
cuDNN
Next, register for an account in NVIDIA’s Accelerated Computing Developer Program and download cuDNN 5.0 to your local machine.
SCP the TAR archive file to the EC2 GPU instance:
$ scp -i ~/folder_key_pair/key_pair.pem ~/folder_tar_file/cudnn-8.0-linux-x64-v5.0-ga.tgz ubuntu@public_dns_ec2:/home/ubuntu/
SSH into the EC2 GPU instance and untar the file:
$ tar -zxvf cudnn-8.0-linux-x64-v5.0-ga.tgz
Finally, open your ~/.bashrc and add these lines:
export LD_LIBRARY_PATH=/home/ubuntu/cuda/lib64:$LD_LIBRARY_PATH
export CPATH=/home/ubuntu/cuda/include:$CPATH
export LIBRARY_PATH=/home/ubuntu/cuda/lib64:$LIBRARY_PATH
Reload the .bashrc:
$ source ~/.bashrc
3. Install Keras and Theano
Download Anaconda on the EC2 instance and install it:
$ wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
$ bash Anaconda3-4.2.0-Linux-x86_64.sh
Note: Reload your bash profile (source ~/.bashrc) so that Anaconda is activated.
Finally, install Keras and Theano:
$ pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
$ pip install keras
Note: Make sure that Theano is used as the backend in Keras. If not, change it in the Keras configuration file (~/.keras/keras.json), which is usually created after the first import of Keras in Python:
{
    "backend": "theano",
    "epsilon": 1e-07,
    "floatx": "float32",
    "image_dim_ordering": "th"
}
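If you prefer, the configuration can also be written programmatically instead of editing the file by hand. A minimal sketch using only the standard library (the local filename here is illustrative; in a real setup the file lives at ~/.keras/keras.json):

```python
import json

# Illustrative: write the Keras config programmatically instead of
# editing ~/.keras/keras.json by hand. Local filename for the sketch.
config_path = "keras.json"

config = {
    "backend": "theano",
    "epsilon": 1e-07,
    "floatx": "float32",
    "image_dim_ordering": "th",
}

with open(config_path, "w") as f:
    json.dump(config, f, indent=4)

# Read the file back to confirm the backend setting took effect.
with open(config_path) as f:
    print(json.load(f)["backend"])  # prints "theano"
```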
4. Test your environment
Now you are good to go! Normally, I tend to test my environment with a simple script to see if everything works as expected.
Here is a simple MLP network that is trained on the MNIST dataset:
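The original script isn’t reproduced here; a minimal version might look like the following, saved as mnist_gpu_test_script.py (a sketch assuming the Keras 2 API with the Theano backend; the architecture and hyperparameters are illustrative):

```python
# mnist_gpu_test_script.py -- minimal MLP on MNIST (illustrative sketch)
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import np_utils

batch_size = 128
nb_classes = 10
nb_epoch = 10  # illustrative; increase for better accuracy

# Load MNIST and flatten the 28x28 images into 784-dimensional vectors.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255

# One-hot encode the labels.
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

# A simple two-hidden-layer MLP with dropout.
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(nb_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=nb_epoch,
          verbose=1,
          validation_data=(X_test, Y_test))

score = model.evaluate(X_test, Y_test, verbose=0)
print('Test accuracy:', score[1])
```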
Run the script by invoking this command:
$ THEANO_FLAGS=floatX=float32,device=gpu0,lib.cnmem=0.8,nvcc.flags=-D_FORCE_INLINES,dnn.enabled=True python mnist_gpu_test_script.py
Alternatively, you could specify these settings in the Theano configuration file (~/.theanorc):
[global]
floatX = float32
device = gpu
[lib]
cnmem = 0.8
[dnn]
enabled = True
[nvcc]
flags = -D_FORCE_INLINES
Note: The flag “nvcc.flags=-D_FORCE_INLINES” is very important on Ubuntu 16.04, since CUDA 8.0 does not seem to support the default gcc version (5.4.0) (see Ubuntu 16.04 and CUDA and Fix for glibc 2.23). This is a tricky hack for now. Alternatively, you could link CUDA to an older gcc version by using update-alternatives.
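That alternative can be sketched as follows (gcc 4.9 is an illustrative choice; any gcc version supported by CUDA 8.0 works):

```shell
# Sketch: register gcc/g++ 4.9 as alternatives and prefer them over 5.x,
# so that nvcc picks up a compiler that CUDA 8.0 supports.
sudo apt-get install -y gcc-4.9 g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
gcc --version   # should now report 4.9.x
```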
Finally, here is the output: the script prints the per-epoch training loss and accuracy, followed by the final test score.
Note: It is also interesting to monitor NVIDIA’s system management interface to see if there is some activity, e.g. memory usage, running processes, and GPU utilization.
$ watch -n1 nvidia-smi
Conclusion/Outlook
In this blog post, I described step by step how to set up a deep learning environment on AWS. Contact me via Twitter @datitran if something is unclear, or just follow me. In the next article, I will focus on how to automate the steps explained above by using Docker, which immensely speeds up the setup process.