Installing Python for Data Analysis on a Mac

Andre Bach
Office Analytics (formerly Yammer Analytics)
4 min read · Oct 16, 2014

Getting the good stuff

What’s out there

You probably have Python on your Mac already. Launch the Terminal and type the commands below. The format is a common convention: type the part after the $, then hit enter; the line that follows shows what I got in the console when I ran that command. Then type the part after the second $, and so on. Hopefully you get something similar to what I got for both commands!

$ which python
/usr/local/bin/python
$ python --version
Python 2.7.4

That means that when you run python, you’re using a version located in the /usr/local/bin directory, and it’s version 2.7.4.
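The same two facts can also be read from inside Python itself. Here is a minimal sketch using only the standard library; the helper name describe_interpreter is my own label for illustration, not something Python or this post defines.

```python
import sys


def describe_interpreter():
    """Return (path, version) for the Python that is running this code."""
    # sys.executable is the interpreter's location on disk. It matches what
    # `which python` prints when this same interpreter is first on your PATH.
    path = sys.executable
    # sys.version_info holds the numbers that `python --version` reports.
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return path, version


path, version = describe_interpreter()
print(path)
print("Python " + version)
```

Save it to a file and run it with python to see the path and version of whichever interpreter picked it up.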

It turns out, though, that when people say “Python is great for data analysis”, they actually mean “Python, and all these packages in the Python ecosystem, together, make a great environment for data analysis.” Installing and configuring all the packages can be a pain, as setting up a development environment so often is. Here we’ll walk through the easiest way of getting your local computer all set up to do some sweet analysis.

Anaconda

We won’t be replacing or modifying the Python already on your Mac, but adding a whole new install that’s separate from it. Don’t worry, most of the hard stuff is handled automatically. We’re using Anaconda, a free Python distribution that installs almost all the useful packages and does some other cool stuff. Go to the Anaconda website, click download, then click “I WANT PYTHON 3.4”, and download the graphical installer.

Open the installer (make sure you’ve got a file named “Anaconda3-<stuff>”, not “Anaconda-<stuff>”) and follow the instructions. Did it say it was successful? I hope so. (If it says Anaconda is already installed, the easiest way forward is to uninstall the old version, then run the installer again.) To check, open a new terminal window and try these commands:

$ which python
/Users/gerbil/anaconda/bin/python
$ python --version
Python 3.4.1 :: Anaconda 2.1.0 (x86_64)
$ conda --version
conda 3.7.0

My user’s name is “gerbil”; hopefully it and the snakes will get along. Anaconda puts its python stuff in a directory named “anaconda” in your user’s home directory. With luck, those commands all work for you, and you have those versions (or newer).
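If you would rather check from Python than by eye, here is a rough sketch that asks where the shell would find python and conda, and whether each lives under the anaconda directory in your home folder. The helper lives_in_anaconda and its dirname parameter are hypothetical names I made up for this example; installers other than the one used here may pick a different directory name, which is why it is adjustable.

```python
import os
import shutil


def lives_in_anaconda(path, dirname="anaconda"):
    """True if `path` sits inside <home>/<dirname>, the layout described above."""
    if not path:
        return False  # the tool was not found at all
    root = os.path.abspath(os.path.join(os.path.expanduser("~"), dirname))
    return os.path.abspath(path).startswith(root + os.sep)


for tool in ("python", "conda"):
    found = shutil.which(tool)  # None when the tool is not on your PATH
    print(tool, "->", found, "| in ~/anaconda:", lives_in_anaconda(found))
```

If either line prints False, the terminal is probably still picking up a different install; opening a fresh terminal window is the first thing to try.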

Python and a bunch of packages are now installed! What packages? That “conda” command is how you can manage your packages and environment. Here’s how you can find out what you have:

$ conda list
<so many things! i'm skipping a bunch>
anaconda 2.1.0 np19py34_0
conda 3.7.0 py34_0
ipython 2.2.0 py34_1
matplotlib 1.4.0 np19py34_0
numpy 1.9.0 py34_0
pandas 0.14.1 np19py34_0
scipy 0.14.0 np19py34_0
<so many things! i'm skipping a bunch>

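The first column is the package name, then the version you have installed, then the build string (a tag such as np19py34_0 records that the package was built against NumPy 1.9 and Python 3.4, followed by a build number). Above I just listed a few of the packages I care about most; there are a bunch more.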
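To make the three columns concrete, here is a small sketch that splits the rows shown above into their parts and pulls the NumPy and Python tags out of the third column. The function names parse_conda_line and decode_build are invented for this example, and the pattern only covers tags shaped like the ones in this listing; anything else comes back as None.

```python
import re

# The rows copied from the `conda list` output above.
SAMPLE = """\
anaconda 2.1.0 np19py34_0
conda 3.7.0 py34_0
ipython 2.2.0 py34_1
matplotlib 1.4.0 np19py34_0
numpy 1.9.0 py34_0
pandas 0.14.1 np19py34_0
scipy 0.14.0 np19py34_0
"""


def parse_conda_line(line):
    """Split one row into its three whitespace-separated columns."""
    name, version, build = line.split()[:3]
    return {"name": name, "version": version, "build": build}


def decode_build(build):
    """Extract the optional npXX / pyXX tags and the trailing build number."""
    match = re.fullmatch(r"(?:np(\d+))?(?:py(\d+))?_(\d+)", build)
    if match is None:
        return None  # a third column in some other format
    numpy_tag, python_tag, number = match.groups()
    return {"numpy": numpy_tag, "python": python_tag, "build_number": int(number)}


for line in SAMPLE.splitlines():
    row = parse_conda_line(line)
    print(row["name"], row["version"], decode_build(row["build"]))
```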

You have Python, congrats!

Trying the good stuff

Okay, python, you there? There are three increasingly fancy ways to launch python. We’ll go through them one by one just in case anything breaks.

python

Try the simple way first. (Here >>> indicates a thing to type at the python prompt, and then press enter.)

$ python
Python 3.4.1 |Anaconda 2.1.0 (x86_64)
>>> 7 / 2
3.5
>>> [i**3 for i in range(17)]
[0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, 1331, 1728, 2197, 2744, 3375, 4096]

You’ve executed some python commands! Also, list comprehensions, sweet. You can get out of python and back to the shell with Ctrl-D.
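For anyone new to those two commands, here is a short sketch of what each one is doing. It assumes Python 3, like the Anaconda install above; under the Python 2.7 from the start of this post, 7 / 2 would give 3 instead.

```python
# In Python 3, / is true division and always returns a float.
print(7 / 2)
# // is floor division: for two ints it rounds down and returns an int.
print(7 // 2)

# The list comprehension from the session above...
cubes = [i**3 for i in range(17)]

# ...is shorthand for building the same list with an explicit loop.
cubes_loop = []
for i in range(17):
    cubes_loop.append(i**3)

print(cubes == cubes_loop)
```

Both versions produce the cubes of 0 through 16; the comprehension just says it in one line.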

ipython

Actually, all the cool kids use the “IPython” prompt nowadays. It gives you color coding, a better history, pretty printing, timing information, easier debugging, so many things. Using it is as easy as adding an i:

$ ipython
Python 3.4.1 |Anaconda 2.1.0 (x86_64)
IPython 2.2.0 -- An enhanced Interactive Python.
In [1]: 7 / 2
Out[1]: 3.5
In [2]: [i**3 for i in range(17)]
Out[2]:
[0,
 1,
 8,
 27,
 64,
 125,
 216,
 343,
 512,
 729,
 1000,
 1331,
 1728,
 2197,
 2744,
 3375,
 4096]

So pretty: the long list gets one element per line. If you’re interested, type “?” in IPython to learn about IPython, or “help()” to learn about Python itself.
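You can get a similar one-element-per-line layout in plain python too. The sketch below uses the standard library's pprint module, which is a stand-in here and not the pretty printer IPython itself uses; the width of 40 is an arbitrary choice of mine to force the list to wrap.

```python
from pprint import pformat

cubes = [i**3 for i in range(17)]

# The list does not fit in 40 columns, so pprint breaks it up and
# puts each element on its own line.
text = pformat(cubes, width=40)
print(text)
```

IPython does this for you automatically at the prompt, which is a good part of why it feels nicer for poking at data.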

ipython notebook

Last section, when I said all the cool kids use the IPython prompt nowadays… that was a lie. Actually, all the really cool kids use IPython notebooks nowadays. Launch a notebook:

$ ipython notebook
<a bunch of stuff in the console>

(Screenshot: the file viewer before making a new notebook.)

If everything is golden, a new tab will open in your web browser of choice! Click New Notebook, then try those same python commands we did above. Here, shift-enter will execute commands, not enter. Such color coding! Poke around a bit, and maybe try the User Interface Tour under the Help menu.

IPython notebooks are a nice GUI-ful, share-able, Integrated Development Environment-esque way to do Python. Our main lesson will be in an IPython notebook. I’m pumped. See you soon!

Andre Bach
Office Analytics (formerly Yammer Analytics)

Now a Data Scientist doing analytics at Yammer! I spent most of the past 7 years learning and researching particle physics at Berkeley and CERN.