Intro to Python’s Pandas

Jacqueline Kazil
Published in The coderSnorts
2 min read · Feb 22, 2015
Screenshot from IPython notebook showing a sample DataFrame output.

Pandas is an easy-to-use data structures library for the Python programming language. It is like R, but in Python. The advantage of using Pandas over R is the access you get to the rest of the Python ecosystem, such as MapReduce libraries for “big data” sets.
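To give a feel for what a “data structures library” means here, the following is a minimal sketch of a DataFrame, the table-like structure at the heart of Pandas. The column names and values are hypothetical, invented for illustration; they are not the data used in the Picwell tutorial.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {
        "city": ["Washington", "Baltimore", "Richmond"],
        "attendees": [42, 17, 23],
    }
)

# Selecting one column returns a Series; sum() adds up its values.
total = df["attendees"].sum()

# A boolean mask keeps only the rows where the condition is true.
large = df[df["attendees"] > 20]

print(df)
print(total)
print(large["city"].tolist())
```

Column selection and boolean filtering like this cover a large share of everyday Pandas use, and they will look familiar if you have worked with data frames in R.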

Today, PyLadies DC hosted a meetup where Clara Bennett, Software Engineer, and Tyler Jorgensen, Data Scientist, from Picwell led an Intro to Pandas class. Their materials are great and easy to follow, even if you did not attend the class.

The materials are presented using IPython notebook. There are two modules. The first module is the basic tutorial. The second module is a case study, which uses the same data but introduces concepts beyond Pandas, like plotting and simple modeling.

You can preview the notebooks and read through them by clicking on the links, or you can work through them locally using the following steps:

To get started…

Install requirements

You will need the following libraries. You can install them all at once with this command:

pip install pandas "ipython[notebook]" matplotlib

List of libraries:

  • Pandas
  • IPython with notebook dependencies
  • (optional) matplotlib
  • (super optional) Virtualenv, Virtualenvwrapper

Download the repository of code from GitHub

Repo: https://github.com/picwell/intro_pandas

If you have git installed, use it to clone the repository to your computer. If you don’t have git installed…

  1. Download the zip to your computer
  2. Unzip it somewhere meaningful to you, such as the place where you keep code
  3. From Terminal, navigate to the unzipped folder. Something like…
cd ~/Projects/code/classes/intro_pandas

Launch IPython notebook

From the intro_pandas folder, launch IPython notebook:

ipython notebook

If this fails, read the last couple of lines of the error. You might be missing a dependency: Pyzmq, Tornado, or Jinja2.
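If the error message is hard to read, one way to narrow things down is to ask Python which modules it can find. The sketch below uses only the standard library. The helper name find_missing is my own, and the list assumes the usual import names for the packages mentioned in this post (note that Pyzmq is imported as zmq).

```python
import importlib.util

# Import names for the packages mentioned in this post.
# Assumption: pyzmq is imported as "zmq", Jinja2 as "jinja2".
MODULES = ["pandas", "IPython", "matplotlib", "zmq", "tornado", "jinja2"]


def find_missing(modules):
    """Return the names in `modules` that Python cannot locate for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All modules found.")
```

Anything reported as missing can usually be installed with pip. This only checks that a module can be located, not that its version is compatible.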

At this point, your browser should open to the IPython notebook console.

In your browser, click on pandas_tutorial.ipynb.

From there, the notebook is self-sufficient and has the notes to guide you through the rest.

If you are interested in exercising your new Pandas skills or learning more, the authors of this tutorial and I suggest looking at the following links:
