Deep Learning Setup in Arch Linux: From Start To Finish with PyTorch + TensorFlow + Nvidia CUDA + Anaconda
This article serves two purposes: 1) to give an almost complete, step-by-step guide to installing Arch Linux with PyTorch + Nvidia CUDA/cuDNN + Anaconda + TensorFlow CPU (for tensorboard) for anyone looking into using Arch for that, and 2) to give me a guide tailored to my machine, since I always end up searching for the same information when I install Arch!
Some troubleshooting can be found at the end of this article, but it only covers the issues I had to deal with during this installation, so it’s not set in stone, just what worked for me. Some aren’t that important, but I do have a slight OCD about my dmesg -l err,warn output being relatively clean, i.e., no errors whatsoever and no warnings that suggest instability.
Obviously the proper way to install Arch is to follow the official guide as you learn a lot in the process by reading the material. The Arch community is very active and most questions/problems you might have will be answered in the forums.
OK now that we got all this out of the way let’s jump right in!
Installation medium booted successfully and we are looking at the prompt in the virtual environment.
Load a console keymap to match your keyboard layout. List available layouts using ls /usr/share/kbd/keymaps/**/*.map.gz. Load one by running loadkeys with the layout name.
Verify UEFI mode by checking if the directory is populated.
# ls /sys/firmware/efi/efivars
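If you would rather script this check than eyeball the listing, the same idea can be sketched in Python. This is a minimal sketch and not part of the official guide: the function name efivars_populated is made up for illustration, and it simply treats a missing or empty directory as "not booted in UEFI mode".

```python
import os


def efivars_populated(path="/sys/firmware/efi/efivars"):
    """Return True if the efivars directory exists and has at least one entry."""
    try:
        return len(os.listdir(path)) > 0
    except OSError:
        # Directory missing or unreadable: assume we are not in UEFI mode.
        return False


if __name__ == "__main__":
    if efivars_populated():
        print("Booted in UEFI mode")
    else:
        print("efivars missing or empty: probably BIOS/legacy mode")
```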
Let’s ping something
The install media provides netctl. The following will bring up a GUI to set up your wireless connection, assuming you’re using wireless.
# ping www.google.com
Set the clock
Update the system clock and check its status.
# timedatectl set-ntp true
# timedatectl status
Ready the disk
Partition the disk using parted. I want my system to have swap, root and an EFI partition, which is required anyway. If parted complains about alignment you can either ignore it (don’t) or fix it by rounding the boundaries to powers of 2 until the warnings stop. Run lsblk to find your drive.
# parted /dev/sda
mkpart ESP fat32 1MiB 513MiB
set 1 boot on
mkpart primary linux-swap 513MiB 4.5GiB
mkpart primary ext4 4.5GiB 100%
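To sanity-check the boundaries before typing them into parted, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, and the 1 MiB alignment boundary is an assumption on my part (a common choice, not something parted requires).

```python
MIB_PER_GIB = 1024


def to_mib(value):
    """Convert a parted-style size such as '513MiB' or '4.5GiB' to MiB."""
    if value.endswith("GiB"):
        return float(value[:-3]) * MIB_PER_GIB
    if value.endswith("MiB"):
        return float(value[:-3])
    raise ValueError(f"unsupported unit in {value!r}")


def sizes(layout):
    """Map each partition name to its size in MiB."""
    return {name: to_mib(end) - to_mib(start) for name, start, end in layout}


def is_aligned(value, boundary_mib=1):
    """Check that a boundary falls on a multiple of boundary_mib (assumed 1 MiB)."""
    return to_mib(value) % boundary_mib == 0


# Boundaries taken from the mkpart commands above (root runs to 100%, so it is omitted).
layout = [
    ("ESP", "1MiB", "513MiB"),
    ("swap", "513MiB", "4.5GiB"),
]

if __name__ == "__main__":
    for name, size in sizes(layout).items():
        print(f"{name}: {size:.0f} MiB")
```

So the ESP ends up at 512 MiB and swap at just under 4 GiB.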
Enable swap and check that it is active.
# mkswap /dev/sda2
# swapon /dev/sda2
Format the partitions. The guide mentions that this is not needed if using parted, but I did it anyway.
# mkfs.fat -F32 /dev/sda1
# mkfs.ext4 /dev/sda3
Mount it all
# mount /dev/sda3 /mnt
# mkdir /mnt/boot
# mount /dev/sda1 /mnt/boot
Pre-installation complete! We can now install Arch Linux.
Install base packages
# pacstrap /mnt base base-devel
Generate the fstab file, then inspect /mnt/etc/fstab and check that it matches the docs.
# genfstab -U /mnt >> /mnt/etc/fstab
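Checking the generated file by eye is fine, but here is a rough Python sketch of the kind of check I mean. The function name check_fstab is hypothetical, and it assumes every entry has all six fields and is identified by UUID (which is what the -U flag asks for); adjust it if your fstab looks different.

```python
def check_fstab(text):
    """Return a list of (line_number, problem) tuples for suspicious fstab entries."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        if len(fields) != 6:
            problems.append((number, "expected 6 fields"))
        elif not fields[0].startswith("UUID="):
            problems.append((number, "device not identified by UUID"))
    return problems


if __name__ == "__main__":
    # /mnt/etc/fstab is where genfstab wrote its output above.
    with open("/mnt/etc/fstab") as handle:
        for number, problem in check_fstab(handle.read()):
            print(f"line {number}: {problem}")
```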
Chroot into the new system
# arch-chroot /mnt
Timezone and hwclock
# ln -sf /usr/share/zoneinfo/Core Worlds/Coruscant /etc/localtime
# hwclock --systohc
Uncomment en_US.UTF-8 UTF-8 in /etc/locale.gen, or whatever language you want. Run locale-gen to generate the locale. Set the LANG variable in /etc/locale.conf to match.
If the keyboard layout was changed, make the change permanent by editing /etc/vconsole.conf.
Add a matching hostname entry to /etc/hosts.
127.0.1.1 myhostname.localdomain myhostname
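The entry follows a fixed pattern, so it is easy to build or check with a few lines of Python. A minimal sketch, with made-up function names and localdomain assumed as the default domain (taken from the example line above):

```python
def hosts_line(hostname, domain="localdomain"):
    """Build the 127.0.1.1 line for the given hostname."""
    return f"127.0.1.1 {hostname}.{domain} {hostname}"


def has_hostname_entry(hosts_text, hostname):
    """Return True if hosts_text already maps 127.0.1.1 to the hostname."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2 and fields[0] == "127.0.1.1" and hostname in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    print(hosts_line("myhostname"))
```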
I like to use NetworkManager for managing my network connections. For details read the official docs.
# pacman -S networkmanager
Creating a new initramfs is not required but I’ve had issues in the past so I re-create it at this point.
# mkinitcpio -p linux
Set a root password by running passwd.
I use grub as my boot loader. Given that I have an Intel CPU, I also enable microcode updates by installing intel-ucode.
# pacman -S grub intel-ucode
Install the grub UEFI application.
# grub-install --target=x86_64-efi --efi-directory=/boot
Generate the config file. Microcode updates will be added automatically.
# grub-mkconfig -o /boot/grub/grub.cfg
# umount -R /mnt
Hopefully everything worked just fine and we are logged into a working Arch environment. Now to the post-installation adventure!
It’s always a good idea to not run things as root. Add a new user.
# useradd -m -G wheel -s /bin/bash yoda
# passwd yoda
To grant sudo access run visudo and add yoda ALL=(ALL) ALL.
Trim support for SSD
Add support for SSD trim. For more info check the docs. Verify trim support by running lsblk -D. The DISC-GRAN and DISC-MAX columns should show non-zero values if trim is supported.
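If you have several drives, reading the columns can be sketched in Python. This is just an illustration: the function names are made up, the sample output is hypothetical, and I am assuming that a device without trim support shows zero values (such as 0B) in both columns.

```python
def is_zero(value):
    """True for values such as '0' or '0B'."""
    return value.rstrip("B") == "0"


def trim_support(lsblk_output):
    """Map each device name in `lsblk -D` output to True/False for trim support."""
    lines = [ln for ln in lsblk_output.splitlines() if ln.strip()]
    header = lines[0].split()
    gran_i = header.index("DISC-GRAN")
    max_i = header.index("DISC-MAX")
    support = {}
    for line in lines[1:]:
        fields = line.split()
        name = fields[0].lstrip("├└─│|`-")  # drop the tree-drawing prefix
        if len(fields) <= max(gran_i, max_i):
            support[name] = False  # columns missing: treat as unsupported
            continue
        support[name] = not (is_zero(fields[gran_i]) or is_zero(fields[max_i]))
    return support


# Hypothetical sample output, not taken from my machine.
SAMPLE = """\
NAME   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda           0      512B       2G         0
├─sda1        0      512B       2G         0
sdb           0        0B       0B         0
"""

if __name__ == "__main__":
    for device, ok in trim_support(SAMPLE).items():
        print(device, "supports trim" if ok else "no trim support")
```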
The util-linux package provides fstrim.timer. Enabling the timer will run the fstrim service weekly.
# systemctl start fstrim.timer
# systemctl enable fstrim.timer
Enable NetworkManager service
After enabling, run nmtui to connect via its text-based UI.
# systemctl start NetworkManager.service
# systemctl enable NetworkManager.service
iptables is already installed. I like ufw. More info in the docs.
# pacman -S ufw
# ufw enable   # only once, when the package is installed
# systemctl start ufw
# systemctl enable ufw
Install Xorg display server if you want a desktop environment later.
# pacman -S xorg-server xorg-xinit xterm
The Deep Learning Stuff
How to install and configure Nvidia drivers, CUDA, cuDNN, Anaconda, TensorFlow and PyTorch with a sprinkle of troubleshooting at the end.
To identify your card run lspci | grep -e VGA -e 3D. I have an Nvidia GeForce GTX 960. Install the drivers.
# pacman -S nvidia nvidia-utils
The nvidia package contains a file which blacklists the nouveau module, so rebooting is necessary. In addition there is now support for DRM kernel mode setting.
To enable it, add the nvidia-drm.modeset=1 kernel parameter in /etc/default/grub, and add nvidia_drm to your initramfs modules in /etc/mkinitcpio.conf. The following will update grub, initramfs and also configure Xorg.
# grub-mkconfig -o /boot/grub/grub.cfg
# mkinitcpio -p linux
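Editing /etc/default/grub by hand is a one-liner, but if you reinstall often the change can be scripted. Below is a minimal sketch that works on the file contents as a string. The function name add_kernel_param is made up, and the choice of GRUB_CMDLINE_LINUX_DEFAULT as the variable to edit is my assumption; pass a different key if you keep your parameters elsewhere.

```python
import re


def add_kernel_param(grub_text, param, key="GRUB_CMDLINE_LINUX_DEFAULT"):
    """Return grub_text with param appended to the quoted value of key.

    The parameter is only added once, so running this twice is harmless.
    """
    pattern = re.compile(rf'^({re.escape(key)}=")([^"]*)(")', re.MULTILINE)

    def append(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(append, grub_text)
    if count == 0:
        raise ValueError(f"{key} not found")
    return new_text


if __name__ == "__main__":
    example = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
    print(add_kernel_param(example, "nvidia-drm.modeset=1"))
```

Remember that the grub config still has to be regenerated afterwards for the change to take effect.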
To avoid the possibility of forgetting to update initramfs after updates I created a pacman hook as mentioned here.
*** see troubleshooting for resolution and framebuffer issues.
CUDA — cuDNN
Super easy, it just works out of the box. Read the docs if you want; otherwise you’re done by simply doing the following.
# pacman -S cuda cudnn
The installer suggests logging out and back in so that CUDA is added to the PATH. Once logged back in we can test the system by running the samples provided by Nvidia. The cuda package places its files in /opt/cuda. Copy the samples/ folder to the home directory and run the sample from samples/bin/. The output should show that the test passed.
Follow the instructions on the website to download. If using the terminal, do the following and just go through the steps. You can create a conda environment to host your deep learning stuff but I don’t bother with that.
# wget https://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-
# bash Anaconda3-4.4.0-Linux-
I install TensorFlow mainly for tensorboard visualization so the CPU version is enough for me. Great documentation at the website if you want to install the GPU version.
# pip install --ignore-installed --upgrade
Check that it’s working by running the following python code.
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()  # TensorFlow 1.x session API
print(sess.run(hello))
Follow the instructions given at the PyTorch website for your installation.
# conda install pytorch torchvision -c pytorch
Since the pytorch channel is added, we can easily update the package by running conda update pytorch torchvision in the future. To test I run a short piece of Python code which also shows if the GPU can be accessed.
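A check along these lines does the job. This is a minimal sketch rather than my exact test script: the function name describe_gpu is made up, and it assumes a reasonably recent PyTorch where tensors have a .device attribute.

```python
def describe_gpu():
    """Report whether PyTorch is installed and whether it can reach the GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}: CUDA not available"
    name = torch.cuda.get_device_name(0)
    x = torch.rand(3, 3).cuda()  # move a small tensor to the GPU as a smoke test
    return f"PyTorch {torch.__version__}: CUDA available on {name}, tensor on {x.device}"


if __name__ == "__main__":
    print(describe_gpu())
```

If CUDA is reported as not available, double check the driver and CUDA steps above before blaming PyTorch.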
Your system is not currently configured to drive a VGA console on the primary VGA device. The NVIDIA Linux graphics driver requires the use of a text-mode VGA console. Use of other console drivers including, but not limited to, vesafb, may result in corruption and stability problems, and is not supported.
This warning can be seen when running dmesg -l err,warn. I wanted it gone because I don’t want to deal with potential corruption and instability in the future. I also found it was preventing me from having a high-res console (more on that later).
You can find info about this here. The NVIDIA driver does not provide an fbdev driver for the high-resolution console for the kernel's compiled-in vesafb module. However, the kernel's compiled-in efifb module supports a high-resolution console on EFI systems. Following the instructions here didn’t solve it for me. After searching I found this reply on Nvidia’s forums by a moderator.
The message is a little misleading in UEFI mode. What it means it that the GPU was initialized to a graphical mode using the legacy VGA BIOS, regardless of whether the system was booted in UEFI mode or not. Typically this happens if the Compatibility Support Module (CSM) is enabled in the system BIOS. If you have an option to disable CSM in the SBIOS, please try that.
This solved my problem. In my BIOS I had to change the following: BIOS Config->Windows8->CSM Support->Never.
Low resolution console. The reason this didn’t work for me was the problem with the CSM in BIOS. After fixing that, following the instructions in that post resulted in a high-res console. In a nutshell, I changed the following and regenerated the grub config.
GRUB_CMDLINE_LINUX="nouveau.modeset=0 rd.driver.blacklist=nouveau video=vesa:off rhgb quiet"