Installation Instructions (Mac)
Data Science — General Assembly (Yasin Dara)
These instructions are for Macs only.
Step 1: Install Homebrew (Instructions: http://brew.sh/)
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
Do not forget to run $ brew doctor after you finish the installation, to learn about potential issues you may have.
Step 2: Install git using homebrew
brew install git
brew update
Step 4: Install mysql using homebrew
brew install mysql
Step 5: Confirm that you have mysql installed on your mac, by typing
mysql
at the terminal. If you get an error such as “command not found”, you will need to configure your PATH variable, as follows:
export PATH=/usr/local/mysql/bin:$PATH
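If you find yourself re-running the PATH fix, a small helper can keep the variable from filling up with duplicates. This is only a sketch: prepend_path is a hypothetical function name, and /usr/local/mysql/bin is the location named above, which may differ on your machine.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not already there.
prepend_path() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present, leave PATH unchanged
    *) PATH="$1:$PATH" ;;      # otherwise put the directory first
  esac
  export PATH
}

# Assumption: mysql lives in the directory mentioned in Step 5.
prepend_path /usr/local/mysql/bin
```

Assuming bash is your login shell, placing these lines in ~/.bash_profile should apply the change to every new terminal session.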
Step 6: Once you have Homebrew and mysql installed, confirm that you have pip (the Python package manager) on your computer. To do so, type
pip
at the terminal. If you get a “command not found” error, follow the instructions here to get pip installed.
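Rather than typing each command and watching for errors, you can check all the tools from the steps so far in one pass. The sketch below relies on the standard command -v shell builtin; have_cmd is a hypothetical helper name, not something installed by any of the steps above.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report on every tool these instructions have asked for so far.
for tool in brew git mysql pip; do
  if have_cmd "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

Anything reported as missing points you back to the step that installs it.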
Step 7: Install virtualenv, a tool that creates isolated Python environments. It is recommended that you read the blog post here, if you haven’t worked with virtual environments in Python before.
sudo pip install virtualenv
Navigate to a directory of your choosing, and remember the path of that directory. You will be using this as your working folder for assignments in this class.
Now, initialize virtualenv in this directory:
virtualenv env
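The commands later in these instructions call env/bin/pip directly, which works without any further setup. If you would rather have plain python and pip point at the virtual environment for the rest of your terminal session, you can activate it. A minimal sketch, assuming you are in the working directory where you ran virtualenv env:

```shell
# Activate the virtualenv in the current shell if it exists in this directory.
if [ -f env/bin/activate ]; then
  . env/bin/activate
  echo "active virtualenv: $VIRTUAL_ENV"
else
  echo "no env/bin/activate here; run 'virtualenv env' in this directory first"
fi
```

Typing deactivate returns the session to the system Python.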
Step 8: Using git, which you previously installed, obtain the requirements.txt file from the class repository. I will not be providing instructions for this step, because learning to use git (even if you’re learning it on the fly right now) is definitely essential for data scientists and engineers. If you’re sneaky about it and you don’t want to clone the class repository for some reason, you can use wget or curl to obtain the file.
The file requirements.txt is located in the DS-LA-03/src/lesson01 folder.
Here is a link to the class github repository:
https://github.com/ga-students/DS-LA-03/
Step 9: Prepare to install the Python packages into the virtual environment.
The numpy installer is broken on the latest version of OSX (something to do with XCode 5.1). If you have OSX Mavericks and are using the virtual env, please run:
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future env/bin/pip install --upgrade numpy
numpy has been removed from the requirements.txt file.
To install numpy generically, systemwide:
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future pip install --upgrade numpy
This is usually where the @!$% hits the fan. Move the requirements.txt file into the directory that contains your env directory, or make sure you know the path to the requirements.txt file.
Attempt to install requirements for the virtual environment:
sudo ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future env/bin/pip install -r requirements.txt
At this point you may deviate from these instructions at will to fix errors on your machine. I have designed this virtual environment with OSX Mavericks 10.9.2 installed on my machine. I’ve tested it on OSX Version 10.7.5 and higher, and it seems to work.
Ultimately, you are responsible for ensuring that all the packages in the requirements.txt file install properly on your computer. Google, Stack Overflow, your peers, and I will help you along the way.
Important: If you prefer not to use the virtual environment, that’s fine. You can find a complete list of the Python dependencies that need to be installed in the presentation PDF for lesson 1, in the course repository.
Step 10: If (and only if) you are on OSX Mavericks, you will need to install scikit-learn separately, because of issues with the way Mavericks handles gcc compiler flags. Run:
export CFLAGS=-Qunused-arguments
export CPPFLAGS=-Qunused-arguments
sudo -E pip install -U scikit-learn
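Once the installs finish, it is worth confirming that the packages actually import before moving on. The sketch below checks the two packages named in these instructions; can_import is a hypothetical helper, and the PY variable is an assumption that defaults to the python on your PATH (set PY=env/bin/python to check the virtual environment instead). Note that scikit-learn is imported under the name sklearn.

```shell
# Hypothetical helper: succeed only if the given interpreter can import the module.
can_import() {
  "$1" -c "import $2" >/dev/null 2>&1
}

# Assumption: check the python on PATH unless PY is already set.
PY=${PY:-python}
for module in numpy sklearn; do
  if can_import "$PY" "$module"; then
    echo "$module: OK"
  else
    echo "$module: could not be imported with $PY"
  fi
done
```

A module that fails here usually means the corresponding pip install step reported an error that is worth scrolling back to find.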
Step 11: You will also need to install other software on your computer.
Obtain an FTP Client. Your choice. I use Transmit, but it’s not free.
Obtain MySQL Workbench.
Obtain R. I use the Berkeley mirror.
Step 12: Begin the exercise here:
https://github.com/ga-students/DS-LA-03/wiki/Lesson-01-Command-line-tutorial