Setting Up a Project for Machine Learning
I have spent quite some time learning about Machine Learning, but I have never documented how to set up a project for it. Here I document the process step by step, so I can come back and read it in case I forget how to do it.
Some tools that are common in Machine Learning:
- Miniconda : A minimal version of Anaconda that ships with the conda package manager. Think of Miniconda as a package management tool.
- Pandas : A data analysis and manipulation tool.
- NumPy : A numerical computing library built around optimized array operations.
- Matplotlib : A data visualization tool.
- scikit-learn : A Machine Learning tool.
- Jupyter notebook : A notebook that lets you write Machine Learning code and documentation side by side.
As Miniconda is a package management tool, libraries such as Pandas, NumPy, Matplotlib, scikit-learn and Jupyter notebook can all be installed through it, without installing them individually.
1. Install Miniconda. To install it on your local machine, follow the instructions for your platform on the official site.
2. Once Miniconda is fully installed, create a folder for your project on your local machine.
3. Open a terminal, go to your project folder and run the command:
conda create --prefix ./env pandas numpy matplotlib scikit-learn jupyter
This command tells Miniconda to install Pandas, NumPy, Matplotlib, scikit-learn and Jupyter notebook, and to create a folder named env in the current directory. With Miniconda, a project works inside an environment that contains all the installed libraries; otherwise the project cannot use them. That is why we use --prefix ./env: it tells Miniconda where the environment folder is, and creates it if it does not exist.
Follow the Miniconda prompts and wait until all packages are installed.
4. Before we can work on the project, we need to activate the environment, because the installed libraries are only available inside it. To do so, run the command
conda activate ./env
This tells Miniconda to activate the environment that has Pandas, NumPy, Matplotlib, scikit-learn and Jupyter notebook installed. To make sure you are in the right environment, you can run the command
conda env list
The output lists all the environments that have been created. An asterisk (*) indicates which environment you are currently in; here it should be the one activated by passing ./env as the parameter to the conda activate command.
Once the environment is activated, you can start working on the project by running the command
jupyter notebook
This starts Jupyter notebook, where you can manage folders and files.
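Once Jupyter is running, the active environment can also be confirmed from inside a notebook cell. The following is a minimal sketch, assuming the environment lives in ./env and the notebook was started from the project folder; the helper name running_in_env is hypothetical and only used for illustration. It compares sys.prefix (the root folder of the Python installation currently running) with the expected environment folder, and prints the CONDA_PREFIX variable that conda sets on activation.

```python
import os
import sys
from pathlib import Path


def running_in_env(env_dir="./env"):
    """Return True if the running Python interpreter lives in env_dir.

    running_in_env is a hypothetical helper; ./env is the folder
    created earlier with --prefix ./env.
    """
    expected = Path(env_dir).resolve()
    actual = Path(sys.prefix).resolve()
    return actual == expected


# sys.prefix is the root folder of the Python installation in use.
print("Python prefix:", sys.prefix)
# conda sets CONDA_PREFIX when an environment is activated.
print("CONDA_PREFIX :", os.environ.get("CONDA_PREFIX", "(not set)"))
print("Running in ./env:", running_in_env("./env"))
```

If the last line prints False, the notebook was probably started from a different environment or a different folder than the one holding ./env.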
5. Make sure the installed libraries work in the project. Create a new Python notebook in Jupyter
and add the following code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
to a cell, then run the cell or press Shift + Enter. It might take a while, but if it finishes without an error, the libraries were imported correctly and are ready to be used.
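The import check can be complemented by printing which versions were installed. The following is a minimal sketch using only the Python standard library (Python 3.8 or later); installed_versions is a hypothetical helper name, and the package names are the ones passed to conda create earlier. A result of None only means that no package metadata was found in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names as given to conda create earlier.
PACKAGES = ["pandas", "numpy", "matplotlib", "scikit-learn", "jupyter"]

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:<14}{ver if ver is not None else 'NOT FOUND'}")
```

Running this in a notebook cell gives one line per library, which is handy to compare against the versions recorded when the environment is exported.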
Export and import your environment
Miniconda allows you to export your environment and share it with other people. In addition, you don’t need to type each library’s name every time you create a new project. To export your environment:
- Make sure you are in the environment you want to export by running the command
conda activate [path to env folder]
- Run the command
conda env export > environment.yml
This generates a YAML file named environment.yml in the current directory.
- Share the file with other people.
To create a Miniconda environment from the environment file:
- Run the command
conda env create -p ./env -f environment.yml
- -p is the same as --prefix
- -f points to your exported environment file
- conda create creates a new environment
- conda env create creates a new environment from an environment file