Prepare Your Data Science Environment

Albarqawi
Feb 28
Anaconda environments header. Built on top of a vector image from freepik

Set the context

There are many cases where you need to set up your own environment, such as working on a local machine, preparing a Docker container, or working directly on a server. Doing so can be cheaper and more flexible than using off-the-shelf platforms.

Preparing your machine learning (ML) environment is not complicated with a distribution tool like Anaconda, which helps you manage and distribute Python and R libraries.

Anaconda resolves dependencies: when you install a new library or framework, its dependencies are installed as well. For example, the command to install pandas will install NumPy implicitly.
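
To make the idea of dependency resolution concrete, here is a toy sketch in Python. The dependency map is hypothetical and heavily simplified; it is not Anaconda's real package metadata or solver. It only shows how asking for one package pulls in the packages it depends on.

```python
# Toy illustration of dependency resolution. The map below is a simplified,
# hypothetical example, not real package metadata.
DEPENDENCIES = {
    "pandas": ["numpy", "python-dateutil"],
    "python-dateutil": ["six"],
    "numpy": [],
    "six": [],
}


def install_order(package, dependencies=DEPENDENCIES):
    """Return the packages to install, dependencies first, each listed once."""
    ordered = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        # A package is added only after everything it depends on.
        ordered.append(name)

    visit(package)
    return ordered


if __name__ == "__main__":
    print(install_order("pandas"))
```

In this toy map, requesting pandas also schedules numpy, which mirrors the behavior described above.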

We are going to go through the steps to prepare your Python ML environment and launch notebooks to write code. The extras cover PyCharm integration with Anaconda and preparing Amazon’s Elastic Compute Cloud (EC2) instances for Jupyter Notebook.

Installation

Anaconda provides an installer for the major operating systems. Installation should be easy: run the installer file and follow the default instructions.

The major components:

Next are the steps to set up and execute Anaconda commands on macOS, Ubuntu, and Windows.

macOS

To execute Anaconda commands, the Anaconda path has to be defined in your terminal profile. Usually, the Anaconda installer handles this.

1. Edit the bash profile:

nano ~/.bash_profile

2. Make sure the Anaconda path is available, or add it yourself (the path can change based on your installation):

export PATH="/opt/anaconda3/bin:$PATH"

3. (Optional, for the zsh shell) If your terminal uses zsh, the bash profile is not loaded by default and an extra step is required.

zsh terminal

Use this command to edit the path:

nano ~/.zshrc

Add the path to the Anaconda bin folder if it is not available:

export PATH="/opt/anaconda3/bin:$PATH"

Or add the following script to load the bash profile:

if [ -f ~/.bash_profile ]; then . ~/.bash_profile; fi
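
After editing the profile and opening a new terminal, a short Python check can confirm that the path took effect. This is a minimal sketch: the helper names are illustrative, and "/opt/anaconda3/bin" is simply the example location used above, so adjust it to your installation.

```python
import os
import shutil


def path_entries(path_value):
    """Split a PATH-style string into its directories, skipping empty parts."""
    return [entry for entry in path_value.split(os.pathsep) if entry]


def is_on_path(directory, path_value):
    """Return True if `directory` is one of the entries in `path_value`."""
    target = os.path.normpath(os.path.expanduser(directory))
    return any(
        os.path.normpath(os.path.expanduser(entry)) == target
        for entry in path_entries(path_value)
    )


if __name__ == "__main__":
    # "/opt/anaconda3/bin" is the example location used above; yours may differ.
    current_path = os.environ.get("PATH", "")
    print("Anaconda bin on PATH:", is_on_path("/opt/anaconda3/bin", current_path))
    # shutil.which returns the full path of the first matching executable, or None.
    print("conda executable:", shutil.which("conda"))
```

If the second line prints None, the terminal still cannot find conda and the profile is worth rechecking.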

Ubuntu

On other operating systems it is straightforward to run the installer. On a Linux-based OS, however, you install Anaconda from the terminal:

1. Download and run the installer, then remove the installation file:

wget <installer_url_from_anaconda.com>
sh <installation_file.sh>
rm <installation_file.sh>

Then you have to define the terminal path, which is usually done by the installer:

2. Open the “.bashrc” file:

nano ~/.bashrc

3. Verify that the path to Anaconda is available, or add it to the end of the file:

export PATH=~/anaconda3/bin:$PATH

4. Reload the file in the current terminal:

source ~/.bashrc
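
To confirm that the terminal can now run Anaconda commands, you could ask conda for its version from Python. The sketch below is one possible check, not an official Anaconda tool; the function name is illustrative, and it returns None instead of failing when conda cannot be found.

```python
import shutil
import subprocess


def conda_version(executable="conda"):
    """Return the output of `conda --version`, or None if conda is not found."""
    resolved = shutil.which(executable)
    if resolved is None:
        return None
    result = subprocess.run(
        [resolved, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    # Some tools print their version on stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = conda_version()
    print(version if version else "conda was not found on the PATH")
```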

Windows

To start writing Anaconda commands on a Windows machine, go to the Start menu and search for “Anaconda Prompt”.

Anaconda Prompt on Windows machines. Source: anaconda.com

For a user interface with the major environment actions, open “Anaconda Navigator”.

Desktop Navigator. Source: anaconda.com

Manage your environment

Anaconda allows you to create a separate environment for each of your projects. There are two ways to manage the environments: (1) using the commands, which are flexible and portable, or (2) using a user interface to control the major actions.

Way 1: Anaconda commands

Open the terminal and execute the commands.

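# Create a new environment (optionally pinning the Python version)
conda create -n env_name
conda create -n env_name python=3.8
# Remove an environment
conda env remove -n env_name
# List all environments
conda env list
# Activate an environment
conda activate env_name
# Export the active environment to a file, then recreate it from that file
conda env export > environment.yaml
conda env create -f environment.yaml
# Save the pip-installed packages to a file, then install from that file
pip freeze > requirements.txt
pip install -r requirements.txt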
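
After activating an environment, it can be useful to confirm from inside Python which environment is actually in use. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the function name is illustrative.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
    # sys.prefix is the root folder of the running Python installation; when you
    # run the environment's own Python, it points into that environment.
    print("Interpreter prefix:", sys.prefix)
    print("Python version:", sys.version.split()[0])
```

If the printed name or prefix is not the one you expect, the environment was probably not activated in the terminal that launched Python.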

Way 2: Anaconda Navigator interface

1. Open Anaconda Navigator.

Anaconda Navigator

2. Go to the environments section to create, clone, import, or remove any environment.

Navigator actions

3. Click “create” to start a new Python or R environment.

Navigator create window

Anaconda commands guide

Install the major libraries

source: https://ahmadai.com/miner/
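# Create and activate an environment
conda create -n env_name
conda activate env_name
# Core data libraries
conda install numpy
conda install pandas
# scikit-learn from the conda-forge channel
conda install -c conda-forge scikit-learn
# TensorFlow through pip (upgrade pip first)
pip install --upgrade pip
pip install tensorflow
# PyTorch from the pytorch channel
conda install pytorch torchvision torchaudio -c pytorch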

NOTE: For the PyTorch command, Python 3.9 users will need to add ‘-c=conda-forge’ for installation.
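
Once the libraries are installed, a quick check from Python can confirm that they import and report their versions. This is a minimal sketch with an illustrative function name; note that some import names differ from the package names used at install time.

```python
import importlib


def library_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every module exposes __version__, so fall back to "unknown".
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Import names differ from some package names:
    # scikit-learn imports as "sklearn" and PyTorch as "torch".
    wanted = ["numpy", "pandas", "sklearn", "tensorflow", "torch"]
    for name, version in library_versions(wanted).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Run it with the environment activated so that the check uses the same interpreter the libraries were installed into.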

Start jupyter notebook

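# Start the notebook on a local machine
jupyter notebook
# Start the notebook on a server: listen on all network interfaces and do not open a browser
jupyter notebook --ip=0.0.0.0 --no-browser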

Extras

PyCharm integration

You can connect your PyCharm project to an existing Anaconda environment.

Select “Previously configured interpreter” and click the three dots.

Select the python file for the desired environment from the interpreter menu.

If the menu is empty, you can click the three dots and navigate to “anaconda3/envs/environment_name/bin/python”.

Go to preferences and select “Python Interpreter”.

Look for the environment in the drop-down menu, or add it from the settings button if it is not available.

Environment location: “anaconda3/envs/environment_name/bin/python”
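
The interpreter location follows a predictable pattern, which the sketch below spells out. The function name and the example root folders are assumptions for illustration; check where Anaconda is actually installed on your machine. The Windows branch reflects the usual layout in which python.exe sits directly in the environment folder.

```python
from pathlib import PurePosixPath, PureWindowsPath


def env_interpreter(anaconda_root, env_name, windows=False):
    """Build the expected path of the Python executable inside a conda environment."""
    if windows:
        # On Windows, python.exe sits directly in the environment folder.
        return PureWindowsPath(anaconda_root) / "envs" / env_name / "python.exe"
    # On macOS and Linux, the executable lives in the environment's bin folder.
    return PurePosixPath(anaconda_root) / "envs" / env_name / "bin" / "python"


if __name__ == "__main__":
    # Both root folders are example locations, not guaranteed defaults.
    print(env_interpreter("/opt/anaconda3", "environment_name"))
    print(env_interpreter(r"C:\Users\me\anaconda3", "environment_name", windows=True))
```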

Prepare EC2 instance

EC2 instance from the marketplace
Jupyter notebook rule

Add a custom TCP rule with port range “8888”. For the source section, it is recommended to add only your IP.
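
To check that the rule works, you could test from your own machine whether port 8888 on the instance accepts connections while the notebook is running. The sketch below is one way to do that with the standard library; the function name is illustrative and the address is a placeholder from a documentation-only range, so replace it with your instance's public IP or DNS name.

```python
import socket


def is_port_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # "203.0.113.10" is a placeholder address; replace it with your instance.
    print(is_port_open("203.0.113.10", 8888))
```

A False result can mean the rule is missing, your IP is not the allowed source, or the notebook server is not running.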

I tested all the steps before writing this article. Hopefully, you find this blog useful for starting your machine learning projects.

Nerd For Tech
