Basic tools for Machine Learning

Emannuel Carvalho
Published in Cocoa Academy
Oct 4, 2017

This post is a reeeally basic tutorial on how to set up a computer for machine learning. I decided to write it when I once again had to install all the tools to start playing around with machine learning on a recently reset computer. It might be useful if you're about to take your first steps, or if you come from areas other than technology.

After the launch of Apple's Core ML framework last June, many iOS Developers have become more interested in Machine Learning. If you're one of them, this tutorial could be a great first step for you!

The tools you will need may vary depending on the kinds of tasks you will perform. However, the tools I decided to list here are ones you will most likely have to install sooner or later anyway, so I promise you won't be wasting (all of) your time. It might also make you a bit more confident to start looking for and installing other tools later.

Throughout the tutorial, I will be using a MacBook with macOS High Sierra. Everything should work quite similarly on any computer with a Unix-based OS.

Warm up 🏋🏻

Before we actually start installing the tools, we need a nice little friend to help us with the installations. It's called pip and it's a package manager for Python (by the way, I'm assuming you already have Python installed on your computer; if you don't, check this out). You can get pip by simply saving this file and running it in your terminal with the following command:

sudo python get-pip.py

Environment 🌳

In order to organize our tools and their respective versions, we should use "virtual environments". Maybe you won't notice the importance of doing this at first but, trust me, not using them might cost you quite some time later — and maybe even some hours of sound sleep.

I use virtualenv to isolate my environments. In order to install it, run the following command:

sudo pip install virtualenv

In order to make it easier to use different environments, I use virtualenvwrapper, which, as the name suggests, is a simple wrapper around virtualenv. You can install it with the command:

sudo pip install virtualenvwrapper

Don't worry if you come across the following error:

Error trying to uninstall `six`

All you need to do is run the following command (which tells pip to ignore the installed six):

sudo pip install virtualenvwrapper --ignore-installed six

In order to better organise the envs we will create, there's still some setup to do with the virtualenvwrapper:

export WORKON_HOME=~/Envs

mkdir -p $WORKON_HOME

source /usr/local/bin/virtualenvwrapper.sh

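This will create an Envs directory, where all of our envs and everything about them will be kept safe. Keep in mind that these settings only last for the current terminal session. To have them available in every new terminal, add the export and source lines to your shell startup file (for example ~/.bash_profile).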

Now that we are all set up, we can create our first env with the command:

mkvirtualenv ml

If everything worked fine, you will have a (ml) before the command line prompt. If you wanna "leave" the env you just created, you can run deactivate. When you want to return to it: workon ml.

Of course, if you wanna create your env with a different name, all you have to do is replace ml with the name you wanna use.
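If you ever doubt whether the env is really active, you can also ask Python itself. The snippet below is just a minimal sketch, not something virtualenvwrapper provides: the function name in_virtualenv is my own, and the check relies on how virtualenv and venv adjust sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True when this interpreter runs inside a virtual environment."""
    # Older virtualenv releases mark the interpreter with sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the built-in venv module make sys.prefix
    # differ from sys.base_prefix (an attribute that is missing on Python 2).
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("inside a virtual environment:", in_virtualenv())
```

Run it with the env active and again after deactivate, and the printed value should change accordingly.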

The tools 🛠

The tools we will install are Jupyter, Numpy, Scikit-learn, Matplotlib, Tensorflow and Keras.

When installing each of those tools, make sure you are in the environment we just created (if you followed the tutorial, there should be an (ml) right before the command line prompt). If it is not showing, run workon ml in order to get into the environment. All the tools will be available only inside the environment.
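Once you have gone through the installs, a quick way to see what is still missing from the active env is to ask Python which modules it can find. This is only a sketch under a few assumptions: it needs Python 3, the TOOLS mapping and the missing_tools helper are names I made up, and the import names are the ones I know these packages use (note that scikit-learn is imported as sklearn, and the jupyter package pulls in jupyter_core).

```python
import importlib.util

# pip package name -> module name used in an import statement.
# The two differ for scikit-learn, and "jupyter" is checked via jupyter_core.
TOOLS = {
    "jupyter": "jupyter_core",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
    "tensorflow": "tensorflow",
    "keras": "keras",
}


def missing_tools(tools):
    """Return the pip names whose module cannot be found by this interpreter."""
    return [
        pip_name
        for pip_name, module_name in tools.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_tools(TOOLS)
if missing:
    print("still to install:", ", ".join(missing))
else:
    print("all tools found in this environment")
```

Anything it lists can be installed with pip install followed by the name shown.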

Jupyter

Jupyter allows you to edit and run your Python code in your browser and to keep the documentation right next to the code. It also makes it really easy to create material that can be presented.

pip install jupyter

Numpy

Numpy is a Python library for scientific computing. It is essential for almost any work in machine learning or data science.

pip install numpy
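To get a feeling for what Numpy gives you, here is a tiny sketch with made-up numbers (the variable names and values are mine, purely for illustration). It shows the two things you will use all the time: arithmetic applied to a whole array at once, and reductions along an axis.

```python
import numpy as np

# Made-up sample data: four heights in centimetres.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

# Arithmetic applies to every element at once, no loop needed.
heights_m = heights_cm / 100.0
mean_height_m = heights_m.mean()

# A 2 x 3 matrix holding the numbers 0..5, summed column by column.
matrix = np.arange(6).reshape(2, 3)
column_sums = matrix.sum(axis=0)

print("mean height in metres:", mean_height_m)
print("column sums:", column_sums)
```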

Scikit-learn

Scikit-learn is a tool for data mining and data analysis. It offers ready-made implementations of common algorithms for tasks such as classification, regression and clustering.

pip install scikit-learn
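As a first taste, the sketch below trains a small classifier on the iris data set that ships with Scikit-learn. The choice of data set, the k-nearest-neighbours model and the split settings are my own picks for illustration, not something this tutorial depends on.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the bundled iris data set: 150 flowers, 4 measurements each.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to evaluate the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a k-nearest-neighbours classifier and measure its accuracy.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print("accuracy on held-out samples:", accuracy)
```

Most Scikit-learn models follow the same fit and score pattern, so swapping in a different algorithm usually means changing only the line that creates the model.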

Matplotlib

Plotting data is crucial, whether you wanna get a first insight into the task you're working on or you need to show an analysis to someone. Matplotlib is the tool for the job.

pip install matplotlib
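Here is a minimal plotting sketch with made-up data. The file name first_plot.png and the use of the non-interactive Agg backend are my assumptions, chosen so the example also runs from a plain terminal. Inside a Jupyter notebook you would normally just display the figure instead of saving it.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # render to a file, no window needed
import matplotlib.pyplot as plt

# Made-up data: the squares of the numbers 0..9.
xs = list(range(10))
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("A first plot")

# Save the figure to the system's temporary directory.
output_path = os.path.join(tempfile.gettempdir(), "first_plot.png")
fig.savefig(output_path)
plt.close(fig)

print("saved plot to", output_path)
```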

Tensorflow

If you're interested in deep learning, you will very likely end up using Tensorflow. It is a library for numerical computing and is widely used for building neural nets.

pip install tensorflow

Keras

Again, if you're into deep learning, Keras can make a huge difference. It has amazing APIs for building and training models, getting data and making predictions. It uses Tensorflow or Theano as its backend; in our case, we will be using Tensorflow.

pip install keras

You're all set! 🚀

Wow. That's been a lot of "pip installs"! Good news is you're now pretty much equipped with everything you need to start diving into the amazing world of machine learning.

If you wanna start with the classic example, I recommend you take a look at this post, where you will use many of the tools I listed here in order to create a model for handwritten digit recognition with the MNIST data set — which is basically the "hello world" of machine learning.

For the iOS devs out there: in the following posts I will show how to create a model for a specific task and then integrate it into an iOS app using coremltools and Xcode 9.

Thank you for reading!

If you have any comments or suggestions, or if I messed up somewhere, let me know in the comments. That would be really appreciated!

You can always find me here.

See ya! 😎
