Exploring your pandas DataFrame

I’ll start with a few useful commands that let you explore your data and see what it contains. I use the 2015 calls for service data from New Orleans (available at https://data.nola.gov/).

Seeing what columns you have in your DataFrame

df.columns

Type of object in each column

df.dtypes
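Here is a minimal sketch of both commands on a tiny made-up DataFrame. Only the 'TypeText' column name comes from my data; the 'Priority' column and all the values are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in for the calls for service data.
# 'TypeText' is a real column name; 'Priority' and the values are hypothetical.
df = pd.DataFrame({
    "TypeText": ["TRAFFIC STOP", "BURGLARY", "TRAFFIC STOP"],
    "Priority": [1, 2, 1],
})

# df.columns is an Index of column labels; list() makes it easier to read.
column_names = list(df.columns)

# df.dtypes is a Series mapping each column label to its dtype.
column_types = df.dtypes

print(column_names)
print(column_types)
```

Note that the attribute is dtypes, not types. The exact dtype names you see for text columns can differ between pandas versions.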

Viewing first/last rows of the DataFrame

df.head() — Lets you see the first 5 rows.
df.tail() — Lets you see the last 5 rows.
df.head(n) — You can see the first n rows.
df.tail(n) — You can see the last n rows.
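A quick sketch of head and tail, assuming a made-up ten-row DataFrame with a hypothetical 'call_id' column:

```python
import pandas as pd

# Hypothetical ten-row DataFrame; 'call_id' is not a column from the real data.
df = pd.DataFrame({"call_id": range(1, 11)})

first_five = df.head()    # no argument: first 5 rows
last_three = df.tail(3)   # with n=3: last 3 rows

print(first_five)
print(last_three)
```

Both methods return a new DataFrame, so you can keep chaining other commands onto the result.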

Counting values from a column

Sometimes, you may want to see how many times each value in your DataFrame column appears. Use:

df['col1'].value_counts()

df['col1'].value_counts(normalize=True)

If you pass normalize=True, you will get each value as a fraction of the total instead of a raw count. This may come in handy sometimes.

My column has a lot of distinct values, so these fractions are really small.
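To make the difference between counts and fractions concrete, here is a small sketch. The four rows are made up for illustration and are not taken from the real data.

```python
import pandas as pd

# Four made-up rows; the values are assumptions for illustration.
df = pd.DataFrame({
    "TypeText": ["TRAFFIC STOP", "BURGLARY", "TRAFFIC STOP", "TRAFFIC STOP"],
})

# Raw counts, sorted from most to least frequent.
counts = df["TypeText"].value_counts()

# Same values expressed as a fraction of all non-missing rows.
shares = df["TypeText"].value_counts(normalize=True)

print(counts)
print(shares)
```

By default value_counts leaves missing values out; pass dropna=False if you want them counted too.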

Counting missing values from a column

pd.isnull(df.col1).value_counts()

In my data, the 'TypeText' column has no missing values, so the only value counted is False.
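Here is a sketch of the same check on a made-up DataFrame that does contain missing values. The 'Disposition' column and its values are hypothetical; I also show two shorter alternatives that I find handy, which are my own additions rather than part of the notebook.

```python
import pandas as pd

# Made-up data; 'Disposition' and its values are hypothetical.
df = pd.DataFrame({
    "TypeText": ["TRAFFIC STOP", "BURGLARY", "TRAFFIC STOP"],
    "Disposition": ["RTF", None, None],
})

# True = missing, False = present; value_counts tallies each.
missing_flags = pd.isnull(df["Disposition"]).value_counts()

# Summing the boolean mask gives the number of missing values directly.
missing_total = df["Disposition"].isnull().sum()

# Applied to the whole DataFrame, this gives one count per column.
missing_per_column = df.isnull().sum()

print(missing_flags)
print(missing_total)
print(missing_per_column)
```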

What object do I have?

If you aren’t sure whether you have a DataFrame or a Series, pass your variable (here df) to type:

type(df)
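This matters most when you select columns, because the kind of brackets you use changes what you get back. A minimal sketch, assuming a made-up two-row DataFrame:

```python
import pandas as pd

# Made-up two-row DataFrame for illustration.
df = pd.DataFrame({"TypeText": ["TRAFFIC STOP", "BURGLARY"]})

single_brackets = df["TypeText"]     # one column label -> Series
double_brackets = df[["TypeText"]]   # list of column labels -> DataFrame

print(type(single_brackets))
print(type(double_brackets))
```

If a method you expect is missing, checking the type like this is usually the fastest way to find out why.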

You can find my full code on GitHub: https://github.com/kasiarachuta/Blog/blob/master/Exploring%20your%20pandas%20DataFrame.ipynb