Anaconda: Tool for Data Science
Let’s begin your data science journey.
Data science is a broad field that relies on many tools and packages, and configuring all of them at once is tedious and hard to manage.
Here comes Anaconda. Produced by Anaconda, Inc., it is a distribution of Python and R aimed at data scientists. Its package manager, conda, manages (installs, upgrades, or uninstalls) packages and environments, usually for Python. With Anaconda it’s simple to install packages and create virtual environments so you can work on multiple projects conveniently.
Is it going to be difficult to use?
- No. It’s easy to use, and it is popular because it works very well with Python packages.
Why choose Anaconda?
Anaconda comes with a bunch of tools used in data science and machine learning that can be installed with just a few clicks, so you’ll be all set to start working with data. Using conda to manage your packages and environments also reduces future issues with the various libraries you’ll be using, and it helps isolate projects that have different version dependencies. Anaconda is heavy software (around 500MB) because it comes with so many tools. It bundles many of the common libraries used in commercial and scientific Python work, such as NumPy and scikit-learn (sklearn).
It gives beginners a head start on their data science journey, with all the configuration they need already in place.
Features
- Anaconda Navigator: A graphical user interface that helps you open any installed application, such as Jupyter Notebook or the VS Code editor. See a snapshot of Anaconda Navigator below:
- Conda: A command-line utility for package and environment management. Mac/Linux users can use the Terminal, and Windows users can use the “Anaconda Prompt” to execute conda commands. Windows users may need to run the Anaconda Prompt as an Administrator, for example when Anaconda was installed for all users.
You can check your current conda version with the command below:
conda --version
- Python: A recent version of Python gets installed as an individual package.
- Anaconda Prompt: [Only for Windows] a terminal where you can use the command-line interface to manage your environments and packages.
- A bunch of applications, such as Spyder, an IDE geared toward scientific development. In total, over 160 scientific packages and their dependencies are also installed.
If you don’t need all the packages or need to conserve bandwidth or storage space, there is an option for you — Miniconda.
Miniconda is a smaller distribution than Anaconda and includes only conda and Python. Miniconda can do everything Anaconda is capable of, but it doesn't have the preinstalled packages. You can upgrade from Miniconda to Anaconda at any time by using the command:
conda install anaconda
Download the installer from https://www.anaconda.com/download/. Choose a Python 3.7 or higher version and the installer (64-bit or 32-bit) that matches your system.
After installation, you’re automatically in the default conda environment (named base) with all packages installed, which you can check with:
conda list
Commands:
- It’s best to update all the packages in the default environment.
conda upgrade --all
- To install a package
conda install PACKAGE_NAME   # for example: conda install pandas
1. You can also install multiple packages in one command.
conda install numpy scipy pandas
2. It’s also possible to specify which version of a package you want by adding the version number, which is helpful for projects with different version dependencies.
Some versions of libraries, such as Python 2.x, are no longer developed, yet there are projects that still depend on them. Being able to pin versions this way is one of conda's practical strengths. Note that there are no spaces around the equals sign:
conda install numpy=1.10
- Remove package
conda remove PACKAGE_NAME
- Update package
conda update PACKAGE_NAME
- Search package
conda search '*PACKAGE*'
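After pinning a version as described above, it can be handy to confirm from Python which version actually ended up in the active environment. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is made up for illustration, and it assumes the package ships standard Python metadata, which packages installed by conda or pip usually do.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # numpy and pandas are only examples; substitute the packages you installed.
    for name in ("numpy", "pandas"):
        print(name, "->", installed_version(name) or "not installed")
```

Run it with the environment's own Python so that it reports on that environment rather than on another installation.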
Conda can be used to create isolated environments for your projects. To create an environment, use the following command in your Anaconda Prompt.
conda create -n env_name
To list the existing environments:
conda info --envs
To create an environment with a specific Python version and specific packages:
conda create -n env_name [python=X.X] [LIST_OF_PACKAGES]
- To activate/deactivate a virtual environment
conda activate demo_env   # activate the environment
conda deactivate   # run while demo_env is active to deactivate it
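To confirm from inside Python which environment is active, one option is to look at the interpreter's prefix and at the CONDA_DEFAULT_ENV variable that conda sets on activation. This is a minimal sketch: describe_environment is a hypothetical helper name, and the variable will be missing if Python was not started from an activated conda environment.

```python
import os
import sys


def describe_environment(environ=None):
    """Return the interpreter prefix and the active conda environment name (if any)."""
    environ = os.environ if environ is None else environ
    return {
        "prefix": sys.prefix,  # directory of the running interpreter's environment
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
    }


if __name__ == "__main__":
    info = describe_environment()
    print("Interpreter prefix:", info["prefix"])
    print("Active conda environment:", info["conda_env"] or "none detected")
```

The optional environ argument exists only so the function can be tried with a plain dictionary instead of the real process environment.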
Saving and loading environments
A very useful feature is environment sharing. When you share your code on GitHub, include an environment file in your repo so that others can easily install all the packages used in your code, with the correct versions.
conda env export
The command above prints the name of the environment and all of its dependencies (along with their versions).
To share the environment, save that output to a YAML file like below:
conda env export > environment.yaml
Now you can share this file with people who want the same environment as you.
- To create an environment from an environment file, use the following command:
conda env create -f environment.yaml
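For orientation, an environment file is plain YAML with a name, a list of channels, and a list of dependencies. The sketch below builds such a file by hand in Python just to show the layout. render_environment_yaml is a hypothetical helper, and the package versions are placeholders rather than recommendations; in practice conda generates the file for you.

```python
def render_environment_yaml(name, channels, dependencies):
    """Build the text of a minimal conda environment file."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_environment_yaml(
        name="demo_env",
        channels=["defaults"],
        # Placeholder pins for illustration only.
        dependencies=["python=3.8", "numpy=1.19", "pandas"],
    )
    print(text)
```

A real exported file usually lists many more dependencies, but the three top-level keys are the part worth recognising.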
Remove the Environment
If there are environments you don’t use anymore, use the command below to remove the specified environment. (Deactivate the environment first before removing it.)
conda env remove -n env_name
Share the List of Dependencies
For users not using conda, you may want to share the list of packages installed in the current environment. You can use pip to generate such a list as a requirements.txt file using:
pip freeze > requirements.txt
- You can install all the packages mentioned in the requirements.txt file using:
pip install -r requirements.txt
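A requirements.txt file is plain text, typically with one name==version line per package. As a rough illustration of that format, the sketch below parses such lines into a dictionary. parse_requirements is a made-up helper that only handles the simple name==version form plus comments and blank lines; real files can contain richer specifiers that it does not attempt to handle.

```python
def parse_requirements(text):
    """Map package names to pinned versions for simple 'name==version' lines."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, pinned = line.partition("==")
        # Unpinned lines (no '==') are recorded with a version of None.
        pins[name.strip()] = pinned.strip() if sep else None
    return pins


if __name__ == "__main__":
    # The sample pins are placeholders chosen for illustration.
    sample = "numpy==1.19.5\npandas==1.1.5  # example pin\n\nscipy\n"
    print(parse_requirements(sample))
```

Something like this can help when comparing two environments, since the resulting dictionaries are easy to diff.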
Fact: Installing pandas by itself will also install numpy since numpy is a dependency of pandas. Conda makes sure to also install any packages that are required by the package you’re installing.
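One way to see such dependencies for yourself is to ask Python which requirements a package declares in its metadata. This is a minimal sketch using importlib.metadata.requires (Python 3.8+); declared_dependencies is a hypothetical helper. The strings come from the package's pip-style metadata rather than from conda's own records, so treat the output as indicative.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_dependencies(package_name):
    """Return the requirement strings a package declares, or [] if unavailable."""
    try:
        return requires(package_name) or []  # requires() may return None
    except PackageNotFoundError:
        return []


if __name__ == "__main__":
    # pandas is only an example; the list is empty if it is not installed.
    for requirement in declared_dependencies("pandas"):
        print(requirement)
```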
Wanna know more...
Learn about pip Learn about conda
Pip installation
For more such posts, do follow our publication:
https://medium.com/tek-society
Also do clap! It encourages me to write better! And follow me for its next part.
Thank you!