Installing Anaconda for multiple users

Peter Roche
Jul 26, 2018

Anaconda is a great way to manage your installation of Python and its packages. By default Anaconda installs into the home directory of the user, but you can easily configure it to install to a different location.

Installing in your home directory is fine if you are the only user, but what if you have multiple users on a server, or applications that run under a different user account? You don’t want to configure Anaconda environments for each user; you simply want to share the exact same environment across the users who require it. The steps below show one way to do just that.

To set the scene a little more, a recent project required me, system user ‘peter’, to use Python when developing a data analysis application. In production, the Python scripts would be called from an API managed by ‘tomcat’. I needed to create a set-up that enabled both ‘peter’ and ‘tomcat’, and any other users, to run Python in a managed ‘conda’ environment.

Installing Anaconda

  • Firstly, download the latest version of Anaconda (currently Anaconda3 5.0.1) from the Anaconda site

You’ll need to run the following commands as root, for example by prefixing each one with sudo

  • Create a system user called ‘anaconda’
adduser anaconda 
  • Next, create a folder where Anaconda will be installed
mkdir /opt/anaconda
  • You might need to make the downloaded Anaconda installer executable. From the directory where you downloaded it:
chmod +x Anaconda3-5.0.1-Linux-x86_64.sh
  • Run the install script. When prompted, accept the licence agreement and set the installation folder to /opt/anaconda
./Anaconda3-5.0.1-Linux-x86_64.sh
  • Finally, update the permissions of the folder where Anaconda is installed, so that users other than ‘anaconda’ can use it too
  • Change the ownership of the /opt/anaconda folder, and all subfolders, from root to ‘anaconda’
chown -R anaconda:anaconda /opt/anaconda 
  • Remove write permission for ‘group’ and ‘others’ — we don’t want them messing up our anaconda directory!
chmod -R go-w /opt/anaconda 
  • Explicitly give ‘group’ and ‘others’ read and execute permissions
chmod -R go+rX /opt/anaconda
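
To see what the two chmod calls do, here is a minimal sketch that applies them to a throwaway directory rather than to /opt/anaconda. The directory layout and file names are made up for illustration, and stat -c assumes GNU coreutils, as found on most Linux servers.

```shell
# Build a scratch tree with deliberately loose permissions (illustrative names).
demo=$(mktemp -d)
mkdir -p "$demo/bin"
touch "$demo/bin/tool" "$demo/README"
chmod 777 "$demo" "$demo/bin" "$demo/bin/tool"   # world-writable directories and executable
chmod 666 "$demo/README"                          # world-writable data file

# The same two commands used on /opt/anaconda above.
chmod -R go-w "$demo"    # strip write access from group and others
chmod -R go+rX "$demo"   # add read; add execute only to directories and already-executable files

# Print the resulting octal mode of each path.
stat -c '%a %n' "$demo" "$demo/bin" "$demo/bin/tool" "$demo/README"
```

The directories and the executable end up with mode 755, while the plain file ends up with 644: the capital X in go+rX only grants execute where it already makes sense, so data files are not turned into executables.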

Anaconda is now installed and ready to be used!

There are a couple of nice ways you can now use Anaconda. There are probably lots more ways, but I am just going to outline two.

The main reason for using Anaconda is to create ‘conda’ environments that manage all the package dependencies in your applications. If you are not totally familiar with conda, check out this conda concept outline. Conda is the package manager of Anaconda, the Python distribution provided by Anaconda, Inc. (formerly Continuum Analytics). Quoting directly from the conda website:

A conda environment is a directory that contains a specific collection of conda packages that you have installed. For example, you may have one environment with NumPy 1.7 and its dependencies, and another environment with NumPy 1.6 for legacy testing. If you change one environment, your other environments are not affected. You can easily activate or deactivate environments, which is how you switch between them. You can also share your environment with someone by giving them a copy of your environment.yaml file. For more information, see Managing environments.

Environment for all users

This is typically what you would do on a server where you have multiple users who require access to a common environment. To create a conda environment that any system user can activate, simply switch user to anaconda and create the environment:

su anaconda
/opt/anaconda/bin/conda create -n shared_env package1 package2 package3

This will create an environment called ‘shared_env’ with the three packages listed. The environment will be saved in /opt/anaconda/envs/shared_env/

Because of the permissions we set earlier on the /opt/anaconda folder, any system user can now activate this conda environment by simply issuing the following command:

source /opt/anaconda/bin/activate shared_env
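
For a service account such as ‘tomcat’, which may not have a shell profile in which to source the activate script, one option is to call the environment’s interpreter by its full path. The sketch below assumes the /opt/anaconda layout from this article; run_in_env and CONDA_ROOT are hypothetical names introduced here, not part of conda. Packages that rely on variables set during activation may still need the source-based approach above.

```shell
# Hypothetical helper: run a Python script with a named environment's
# interpreter, without activating the environment first.
CONDA_ROOT="${CONDA_ROOT:-/opt/anaconda}"   # install prefix used in this article

run_in_env() {
    env_name="$1"
    shift
    env_python="$CONDA_ROOT/envs/$env_name/bin/python"
    if [ ! -x "$env_python" ]; then
        echo "no interpreter found at $env_python" >&2
        return 1
    fi
    # Pass the remaining arguments (script path and its options) straight through.
    "$env_python" "$@"
}

# Example call (the script path is illustrative):
#   run_in_env shared_env /path/to/analysis.py --input data.csv
```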

Environment for a single user

System users can also create their own environments, for example when developing a new application. To create a new environment, called ‘my_env’, simply issue the following command:

/opt/anaconda/bin/conda create -n my_env package1 package2 package3

This will create the environment in a conda directory in the user’s home directory, e.g. /home/user/.conda/envs

Installing non-conda packages

Not all packages are available with conda. I recently wanted to use PySolr for a new project. However, it wasn’t available from the list of Anaconda packages.

Fortunately, PySolr is available using the pip package manager. And doubly fortunately, pip is installed along with conda by Anaconda, which means some non-conda packages can actually be installed into your conda environment. Hope that’s not too confusing!

To install PySolr, or another package using pip, all that is required is to issue the following commands (N.B. it is better to switch user to anaconda to help pip with caching and permissions during the install process)

su anaconda
/opt/anaconda/bin/pip install pysolr

This installed PySolr into the ‘root’ conda environment, and allowed me to start using PySolr for some development work.
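
Note that the command above targets the root environment under /opt/anaconda. To install into a named environment such as shared_env instead, one approach is to call that environment’s own pip. This is a minimal sketch under the directory layout used in this article; pip_install_into_env and CONDA_ROOT are hypothetical names, and the environment only contains a pip executable if Python and pip were installed into it.

```shell
# Hypothetical helper: install packages with the pip that belongs to a
# specific conda environment, so they land in that environment only.
CONDA_ROOT="${CONDA_ROOT:-/opt/anaconda}"   # install prefix used in this article

pip_install_into_env() {
    env_name="$1"
    shift
    env_pip="$CONDA_ROOT/envs/$env_name/bin/pip"
    if [ ! -x "$env_pip" ]; then
        echo "no pip found in environment '$env_name' ($env_pip)" >&2
        return 1
    fi
    "$env_pip" install "$@"
}

# Example call, run as the 'anaconda' user so the files keep the right owner:
#   pip_install_into_env shared_env pysolr
```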

Edit #1: 29 October 2019

The above ideas are still valid, but I have more recently been using a different approach to manage multi-user conda environments, which I want to share here.

The issue with the single user environment above is that new packages and environments get added to the user’s home directory (/home/user/.conda/…). This is fine if you really are the only user, but if the server has many users, and those users all save new packages into their home directory, then there will potentially be many copies of the same packages — for example, lots of users with the same install of pandas 0.25.2. This doesn’t seem right, and will start to take up disk space unnecessarily.

A better way is to have some shared disk space, a folder somewhere, and have each user save the packages they need to that shared folder. This means there is only a single copy of each package, which all users share.

Firstly, create some folders for the packages and environments, and set the permissions so that other users can write to them. The first two commands need root privileges; the third is run by each user:

sudo mkdir -p /apps/conda/pkgs
sudo chmod -R ugo+rwx /apps
mkdir -p /apps/conda/$USER/envs

These are the shared folders where the requested conda packages will be saved.

Next, update your .bashrc to use these new folders:

export CONDA_PKGS_DIRS="/apps/conda/pkgs","/opt/anaconda/pkgs","/home/$USER/.conda/pkgs"
export CONDA_ENVS_DIRS="/apps/conda/$USER/envs"

Now when each user creates a new environment, packages will be saved to the pkgs folder and the environment created in their envs folder.
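
If the same .bashrc is used on machines where the shared folder may not exist, the exports can be wrapped in a small guard. The following is a sketch only: use_shared_conda_dirs is a hypothetical helper name, the comma-separated list follows the format shown above, and $HOME stands in for /home/$USER.

```shell
# Hypothetical ~/.bashrc helper: point conda at the shared folders only when
# they exist, and create the per-user envs folder on first use.
use_shared_conda_dirs() {
    shared_root="${1:-/apps/conda}"
    user_name="${USER:-$(id -un)}"

    # Leave conda's defaults alone if the shared package cache is missing.
    [ -d "$shared_root/pkgs" ] || return 0

    # Each user gets their own envs folder next to the shared cache.
    mkdir -p "$shared_root/$user_name/envs" 2>/dev/null || return 0

    # Order matters: the first writable entry is where new packages are saved.
    export CONDA_PKGS_DIRS="$shared_root/pkgs,/opt/anaconda/pkgs,$HOME/.conda/pkgs"
    export CONDA_ENVS_DIRS="$shared_root/$user_name/envs"
}

use_shared_conda_dirs /apps/conda
```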

There are two important things to note here:

  • Firstly, the order of the filepaths is important. Packages are downloaded and saved to the first writable location, in the order specified. After setting your .bashrc, you can confirm that the package cache and environment directories are set correctly with the “conda info” command
[Screenshot: conda info command results]
  • Secondly, the envs directory needs to be on the same file system as the packages, so that conda can create hard links from each newly created environment to the source files in the package cache. This second point is crucial: it means there are not multiple, separate copies of the same source files. Instead there is one file, and each conda environment holds a hard link to it. You can check that this is the case by finding the inode of a file in one environment and then listing all files that have that inode; see the screenshot below, which finds all the files with a particular inode:
[Screenshot: listing all files that share a particular inode]
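
The two checks described in the second point can be sketched on a scratch directory. Everything here is illustrative: the folder names stand in for the package cache and two users’ environments, same_filesystem is a hypothetical helper, and stat -c assumes GNU coreutils.

```shell
# Scratch layout standing in for the package cache and two users' environments.
workdir=$(mktemp -d)
mkdir -p "$workdir/pkgs" "$workdir/alice/envs" "$workdir/bob/envs"

# Hard links only work within one filesystem; compare device numbers to check.
same_filesystem() {
    [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}
same_filesystem "$workdir/pkgs" "$workdir/alice/envs" && echo "same filesystem"

# One cached file, hard-linked into both environments.
echo "print('hello')" > "$workdir/pkgs/module.py"
ln "$workdir/pkgs/module.py" "$workdir/alice/envs/module.py"
ln "$workdir/pkgs/module.py" "$workdir/bob/envs/module.py"

# Look up the inode of one copy, then list every path that shares it.
inode=$(stat -c '%i' "$workdir/alice/envs/module.py")
linked_paths=$(find "$workdir" -inum "$inode" | sort)
echo "$linked_paths"
```

On a real server, the same pattern applies with /apps/conda in place of the scratch directory and any file inside an environment in place of module.py.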

One further question: if one user deletes their conda environment, won’t that delete the files for the other users? No, it won’t. Because of the way hard linking works, deleting the environment simply removes that user’s hard links to the underlying files; the links in other users’ environments, and the data they point to, remain. This is exactly the desired behaviour.
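
This behaviour is easy to reproduce outside conda. The sketch below uses made-up file names on a scratch directory, and stat -c again assumes GNU coreutils.

```shell
# One file with three names: the cache copy plus a link for each of two users.
linkdemo=$(mktemp -d)
echo "shared package data" > "$linkdemo/cache_copy"
ln "$linkdemo/cache_copy" "$linkdemo/env_alice"
ln "$linkdemo/cache_copy" "$linkdemo/env_bob"

# Alice deletes her environment: only her name for the file goes away.
rm "$linkdemo/env_alice"

# Bob's link still resolves to the same data, and two names remain.
remaining_links=$(stat -c '%h' "$linkdemo/env_bob")
bob_content=$(cat "$linkdemo/env_bob")
echo "links left: $remaining_links, content: $bob_content"
```

The data is only freed once the last remaining hard link to it has been removed.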

Conclusion

There is a lot more to Anaconda, but hopefully the above steps will help you get a nice clean setup from which to start your dev work and manage your production environment, all from a single installation of Anaconda.
