Using .loc

5 min read · Sep 20, 2021

I found myself pretty deep into studying data science and still confused about how and when to use .loc. This gap in my knowledge became an annoyance while I was trying to preprocess a very large and confusing dataset for a group project. It was a real stumbling block during EDA (exploratory data analysis) and while working out ways to deal with null values. I wanted to close this gap and learn more! It turns out .loc is a very useful tool, and it is actually fun and not very difficult to use!

.loc lets you pick out specific rows and columns of a pandas dataframe by their labels. It can be used in many different ways and is a fundamental tool to understand and be able to use. The basic layout for .loc is:

dataframe.loc[row:row, column_name:column_name]

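Both ends of a .loc slice are inclusive, for row labels and column names alike. This is different from ordinary Python slicing, which leaves out the end value.

So, dataframe.loc[1:3, 'column1':'column3'] would include the rows labeled 1 through 3 and every column from column1 through column3 (here column1, column2, and column3).

Rows are called by their index labels and columns are called by their names.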
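To make the inclusive slicing concrete, here is a minimal sketch on a tiny made-up dataframe. The name toy, the columns column1 through column4, and all of the values are placeholders for illustration, not part of the Titanic data.

```python
import pandas as pd

# Made-up 5x4 dataframe; the default integer index gives row labels 0..4
toy = pd.DataFrame(
    {
        "column1": [10, 11, 12, 13, 14],
        "column2": [20, 21, 22, 23, 24],
        "column3": [30, 31, 32, 33, 34],
        "column4": [40, 41, 42, 43, 44],
    }
)

# Label-based slice: rows 1 through 3 and column1 through column3, ends included
subset = toy.loc[1:3, "column1":"column3"]
print(subset)

# Three rows and three columns come back, because both end labels are kept
print(subset.shape)
```

Row 3 and column3 both appear in the result, and column4 does not.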

To play around and practice, I imported Pandas and then downloaded the Titanic dataset from Kaggle.

I took a look at the basic information about the dataset.

.info() shows us the size of our dataframe (887 rows, 8 columns), our column names, our datatypes, and our non-null counts (no null values!).

After this initial look I decided to use my new knowledge to answer some questions about our Titanic dataset.
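A first look might go something like the sketch below. With the real data the frame would come from pd.read_csv on the downloaded file; the file name "titanic.csv" in the comment is an assumption. To keep the sketch runnable on its own, it reads three made-up rows from a string instead. The column names Survived, Pclass, Name, Sex, and Age are the ones used in this post; the passengers are placeholders.

```python
import io
import pandas as pd

# Stand-in for the downloaded file; with the real data this would be
# something like pd.read_csv("titanic.csv")  (file name assumed)
csv_text = """Survived,Pclass,Name,Sex,Age
0,3,Passenger A,male,22
1,1,Passenger B,female,38
1,3,Passenger C,female,26
"""
df = pd.read_csv(io.StringIO(csv_text))

# Prints the row count, column names, dtypes, and non-null counts
df.info()

# shape gives (rows, columns) as a tuple
print(df.shape)

# Total number of missing values across all columns
print(df.isnull().sum().sum())
```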

Basic .loc exploration:

What does row 42 look like?
What if we just want to know if the person in row 42 survived?
What if we just want to know their age?
How do we see their age and sex at the same time?
What do rows 10–20 look like?
What about just name, sex, and age for rows 10–20?

Deeper Exploration:

How many children between the ages of 1 and 5 died?
How many men on board had 2 or more siblings or spouses with them?
How many children in Pclass 3 died?
What percentage of married women survived?
What is the average fare paid by Pclass 1?

Basic .loc exploration:

What does row 42 look like?

We specify the row that we want to see, and the : after the comma means we want to see all columns.
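As a rough sketch of that call: the dataframe below is a made-up stand-in with 50 placeholder passengers (not the real Titanic rows), built only so that a row labeled 42 exists. The variable names df, row_42, and same_row are my own.

```python
import pandas as pd

# Made-up stand-in for the Titanic data: 50 placeholder passengers
n = 50
df = pd.DataFrame(
    {
        "Survived": [1 if i % 2 == 0 else 0 for i in range(n)],
        "Pclass": [i % 3 + 1 for i in range(n)],
        "Name": [f"Passenger {i}" for i in range(n)],
        "Sex": ["male" if i % 2 == 0 else "female" for i in range(n)],
        "Age": [float(20 + i % 30) for i in range(n)],
    }
)

# Row labeled 42, all columns; the result is a Series indexed by column name
row_42 = df.loc[42, :]
print(row_42)

# Leaving out the column part returns the same row
same_row = df.loc[42]
```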

What if we just want to know if the person in row 42 survived?

We indicate the row that we would like to see and the column we want. Our dataset tells us that 0 = did not survive and 1 = did survive, so our person in row 42 did survive!
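A single row label plus a single column name returns one value. The sketch below uses a made-up stand-in frame (placeholder passengers, with row 42 set to Survived = 1 to match the text), so the value it prints says nothing about the real dataset.

```python
import pandas as pd

# Made-up stand-in for the Titanic data: 50 placeholder passengers
n = 50
df = pd.DataFrame(
    {
        "Survived": [1 if i % 2 == 0 else 0 for i in range(n)],
        "Name": [f"Passenger {i}" for i in range(n)],
    }
)

# One row label and one column name give back a single value
survived_42 = df.loc[42, "Survived"]
print(survived_42)
```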

What if we just want to know their age?

How do we see their age and sex at the same time?

We put the two column names in a list, separated by a , to indicate that we only want to see these two columns. If they were joined by : instead, .loc would return every column in that range, following the column order of the dataframe (here, Sex through Age).
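The last two questions can be sketched as follows, again on a made-up stand-in frame with placeholder passengers. The column order (Name, Sex, Age) follows the order described in this post; the values and variable names are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in for the Titanic data: 50 placeholder passengers
n = 50
df = pd.DataFrame(
    {
        "Name": [f"Passenger {i}" for i in range(n)],
        "Sex": ["male" if i % 2 == 0 else "female" for i in range(n)],
        "Age": [float(20 + i % 30) for i in range(n)],
    }
)

# Just the age of the person in row 42
age_42 = df.loc[42, "Age"]
print(age_42)

# A list of names picks exactly those columns, in the order listed
age_and_sex = df.loc[42, ["Age", "Sex"]]
print(age_and_sex)

# A slice picks every column in the range, in dataframe order
sex_to_age = df.loc[42, "Sex":"Age"]
print(sex_to_age)
```

In this stand-in frame Sex and Age sit next to each other, so the list and the slice return the same two columns, only in a different order. With more columns in between, the slice would return those as well.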

What about just name, sex, and age for rows 10–20?

The : returns all columns from Name through Age. In this case that is Name, Sex, and Age. Remember that both ends of the range are inclusive.
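Here is a small sketch of both row-range questions on a made-up stand-in frame (placeholder passengers, with the columns in the order this post describes). Because both ends are kept, rows 10 through 20 come back as 11 rows.

```python
import pandas as pd

# Made-up stand-in for the Titanic data: 50 placeholder passengers
n = 50
df = pd.DataFrame(
    {
        "Survived": [1 if i % 2 == 0 else 0 for i in range(n)],
        "Pclass": [i % 3 + 1 for i in range(n)],
        "Name": [f"Passenger {i}" for i in range(n)],
        "Sex": ["male" if i % 2 == 0 else "female" for i in range(n)],
        "Age": [float(20 + i % 30) for i in range(n)],
    }
)

# Rows 10 through 20 with every column
rows_10_to_20 = df.loc[10:20]
print(rows_10_to_20)

# Same rows, but only the columns from Name through Age
name_to_age = df.loc[10:20, "Name":"Age"]
print(name_to_age)
```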

Deeper .loc Exploration:

This is where I thought this tool became really fun! In my group project we used things like the following examples to build functions for preprocessing our data.

I find it easiest to break the question down into individual steps. I ran the code after each addition to make sure it was returning what I wanted.
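One common way to work step by step like this is to build a True/False filter one condition at a time and pass it to .loc, checking the output after each addition. The sketch below does that for the question about children between the ages of 1 and 5 who died. The six passengers are made up, and the names young, young_died, and count are my own, so treat it as an illustration of the approach rather than the exact code from the project.

```python
import pandas as pd

# Made-up passengers, only for illustration
df = pd.DataFrame(
    {
        "Survived": [0, 1, 0, 0, 1, 0],
        "Pclass": [3, 3, 1, 3, 2, 3],
        "Age": [2.0, 4.0, 5.0, 30.0, 1.0, 0.5],
    }
)

# Step 1: rows where Age is between 1 and 5, ends included
young = (df["Age"] >= 1) & (df["Age"] <= 5)
print(df.loc[young])

# Step 2: add the condition that the passenger did not survive
young_died = young & (df["Survived"] == 0)
print(df.loc[young_died, ["Age", "Survived"]])

# Step 3: count the matching rows
count = len(df.loc[young_died])
print(count)
```

Each condition needs its own parentheses, because & binds more tightly than the comparison operators.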