Getting started with Python (for Data Science & Machine Learning)
Prologue: If you have a basic understanding of programming and are eager to get into Artificial Intelligence, Machine Learning, or Data Science, among many other fields, but are baffled by where to start, then this article is absolutely for you.
Why use Python?
“ Python is a widely used high-level, general-purpose, interpreted, dynamic programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java. The language provides constructs intended to enable clear programs on both a small and large scale.”
The following guide is divided into 7 steps: 7 comprehensive steps to get you up and running Python scripts on your machine and tackling real-world problems!
Step 1: Install Anaconda
But what is Anaconda? I am kinda afraid of snakes just so that you know…
“With over 6 million users, the open source Anaconda Distribution is the easiest way to do Python data science and machine learning. It includes hundreds of popular data science packages and the conda package and virtual environment manager for Windows, Linux, and MacOS. Conda makes it quick and easy to install, run, and upgrade complex data science and machine learning environments like Scikit-learn, TensorFlow, and SciPy. Anaconda Distribution is the foundation of millions of data science projects as well as Amazon Web Services’ Machine Learning AMIs and Anaconda for Microsoft on Azure and Windows.”
You can download Anaconda from here. (Download the Python 3.6 version)
Step 2: Setup PATH environment variable
What is this shit? Why do we need it?
“Environment variables are set to allow access to command line tools and to enable other tools to interact with SDKs more easily. PATH specifies the directories in which executable programs are located on the machine that can be started without knowing and typing the whole path to the file on the command line.”
Fair enough. How do I set it up?
- Right-Click on ‘My Computer’ (called ‘This PC’ on recent versions of Windows)
- Click on Properties
- Click on Advanced system settings
- Click on Environment variables
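- If a Path variable already exists, select it and click on Edit so that you add to it instead of overwriting it; otherwise click on New and set Variable name to Path
- Add the Anaconda installation directory, which is where python.exe lives
- Add the Scripts folder inside the Anaconda directory, which is where pip lives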
Here’s a picture guide and what my setup looks like :
You can check whether Python has been properly installed on your machine by heading over to the Command Line and typing python. If you get something like this, you are good to go. It also shows the version of Python running on your machine.
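Once the python prompt opens, you can also ask Python itself what it sees. The snippet below is a minimal sketch that uses only the standard library; the helper names path_entries and find_on_path are made up for this example. It prints the running version, where the interpreter lives, and whether the PATH you just edited can resolve python and pip.

```python
import os
import shutil
import sys


def path_entries():
    """Return the directories listed in the PATH environment variable."""
    raw = os.environ.get("PATH", "")
    # os.pathsep is ';' on Windows and ':' on Linux/macOS.
    return [entry for entry in raw.split(os.pathsep) if entry]


def find_on_path(program):
    """Return the full path of `program` if PATH can resolve it, else None."""
    return shutil.which(program)


print("Python version:", sys.version.split()[0])
print("Interpreter location:", sys.executable)
print("`python` resolves to:", find_on_path("python"))
print("`pip` resolves to:", find_on_path("pip"))
print("Number of PATH entries:", len(path_entries()))
```

If either of the "resolves to" lines prints None, the matching directory is probably missing from your Path variable.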
Step 3: Setting up our Text Editor
My text editor of choice is Sublime Text 3. You can download it here. I would highly recommend watching this video, which will help you set up and beautify your ST3.
Note: You will need to install the SublimeREPL package to run your Python code because the default ST3 console sometimes fails unexpectedly.
Step 4: Installing Dependencies
Welcome to the world of packages/libraries! Simply put, and to quote Siraj, “Dependencies are packages that our code depends on.” There are tons and tons of packages out there that will help you write your Python script. Each library serves a specific purpose.
There is only one rule, however: you need to install them before using them.
There are quite a few ways to install packages. I prefer pip install.
But what is pip install?
pip is a package management system used to install and manage software packages written in Python.
Cool Fact: pip is a recursive acronym that can stand for either “Pip Installs Packages” or “Pip Installs Python”.
Okay cool. But how do I pip install a library?
- Head over to your Scripts folder inside your Anaconda directory.
- Type cmd in the address bar at the top of the folder window and press Enter. This will open the command line in that directory.
- You can read the documentation, GitHub page, or Stack Overflow answers for a library to find out which command installs it.
Most of the time it is “pip install package_name”
- Once the package has installed, you can import it in ST3.
Here’s a picture guide:
Some libraries require an additional step before the pip install. You need to download their respective wheel (.whl) file from here, put the file in your Scripts folder, head over to the cmd, and pip install the file. (Hack: just type a few letters of the file name and hit ‘Tab’ to auto-complete it)
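Before importing a library in your editor, it can help to check whether it is actually installed. Here is a small sketch of one way to do that from Python, assuming nothing beyond the standard library; is_installed and pip_install_command are hypothetical helper names, and numpy is only an example package. The script never runs pip, it just prints the command you would type.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the top-level module `module_name` can be imported here."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package_name):
    """Build the pip command for this interpreter (the command is not executed)."""
    return [sys.executable, "-m", "pip", "install", package_name]


# "json" ships with Python; "numpy" is just an example third-party package.
for name in ["json", "numpy"]:
    if is_installed(name):
        print(name, "is ready to import")
    else:
        print(name, "is missing; try:", " ".join(pip_install_command(name)))
```

Keep in mind that the name you install and the name you import are not always the same. For example, scikit-learn is installed as scikit-learn but imported as sklearn.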
Step 5: Learning to read Documentation
You will need to google a lot. More than half of the time, developers are just trying to find solutions to their problems on Stack Overflow, or reading a package's documentation or its GitHub readme to learn the details of its different modules and how they can be used.
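A lot of documentation is also available without leaving Python. The sketch below uses the standard inspect module to look at a function's docstring, its signature, and the names a module exposes. The built-in json module is used here only because it ships with every Python install; the same calls should work on most packages you install.

```python
import inspect
import json

# The first line of a docstring is usually a one-sentence summary.
doc = inspect.getdoc(json.dumps)
print(doc.splitlines()[0])

# The signature lists the parameters a function accepts.
print(inspect.signature(json.dumps))

# dir() lists the names a module exposes; skip the private ones.
public_names = [name for name in dir(json) if not name.startswith("_")]
print(public_names)
```

In the interactive prompt, help(json.dumps) shows the same docstring in a more readable form.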
Step 6: Write your Python script
Figure out a problem that you would like to solve. Google to find out whether there are any Python libraries available for it. If there is no direct fit, find out what combination of libraries can be used to achieve your goal. Install them on your machine. Read their documentation, and sample code if available, to understand their usage.
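To give a feel for what a first script can look like, here is a tiny, purely illustrative example. The problem (summarizing a handful of numbers) and the data are made up, summarize is a hypothetical helper name, and it relies only on the standard statistics module so it runs without installing anything.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # stdev needs at least two data points, so fall back to 0.0.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Made-up sample data, used only for illustration.
daily_temperatures = [21.5, 23.0, 19.8, 22.4, 24.1]

for key, value in summarize(daily_temperatures).items():
    print(key, "=", value)
```

Save it as a .py file, run it, and then swap in a library that fits your real problem once you have found one.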
Step 7: Follow really smart people
It is essential to stay up to date with the collective knowledge of the worldwide developer community.
I follow these people on Twitter to get my daily dose of inspiration:
“The woods are lovely, dark and deep. But I have promises to keep, and miles to go before I sleep.” — Robert Frost
Keep Calm and Keep Coding!