The Name of the Game

How to add, update and remove index names on pandas DataFrames

Eric Ness
When I Work Data
4 min read · Mar 18, 2019

Photo by Florencia Viadana on Unsplash

There are several subtleties to DataFrame indexes in pandas. They can contain multiple levels, hold different data types and be transformed in dozens of ways. One attribute of these indexes is often ignored: their name!

Indexes have labels for each row or column that indicate the meaning of the associated data values. A DataFrame of users would likely have rows indexed by user_id and index labels of 1, 2, 3 and so forth. The column index would have labels like last_login and user_type.

Index names specify within the DataFrame the type of data the row or column labels represent. For example, the same row index on the users DataFrame can also represent the row number, a foreign key or almost anything else. Without an index name specifying what the labels mean it’s impossible to say. This can lead to errors down the line if the users DataFrame is joined to another DataFrame on the mistaken assumption that both sets of row labels have the same meaning.

Set Up

We’re going to work through how to add, update and remove names from indexes, looking at a variety of DataFrames along the way. To begin, we’ll set up a function to display the contents of a DataFrame as well as the names of its indexes. The remaining snippets of code are continuations and require previous snippets to execute correctly. The complete code for this story is available on GitHub.
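The display helper itself is not reproduced in this copy of the story. Here is a minimal sketch of what such a function might look like. The names show_frame and index_names are hypothetical and do not come from the original code.

```python
import pandas as pd


def index_names(df: pd.DataFrame) -> tuple:
    """Return (row index name, column index name) for a DataFrame."""
    return df.index.name, df.columns.name


def show_frame(df: pd.DataFrame) -> None:
    """Print a DataFrame followed by the names of both of its indexes."""
    row_name, col_name = index_names(df)
    print(df)
    print(f"Row Index Name: {row_name}")
    print(f"Column Index Name: {col_name}")


# Hypothetical usage: a frame built without names reports None for both.
show_frame(pd.DataFrame({"a": [0, 1], "b": [0, 1]}))
```

The two "Name" lines printed by this sketch follow the format of the output shown throughout the rest of the story.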

Unnamed

Here is some code to create a simple pandas.DataFrame. The indexes don’t have names since those are not set.
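The original snippet is missing from this copy. The following is a possible reconstruction, written to run on its own, that builds a frame matching the output below. The exact construction the author used is an assumption.

```python
import pandas as pd

# Two integer columns and a default integer row index; no names are set.
df = pd.DataFrame({"a": range(5), "b": range(5)})

print(df)
print(f"Row Index Name: {df.index.name}")
print(f"Column Index Name: {df.columns.name}")
```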

   a  b
0  0  0
1  1  1
2  2  2
3  3  3
4  4  4
Row Index Name: None
Column Index Name: None

Named At Creation

One way to set the index names is during DataFrame creation. Notice the name parameters for the row and column indexes.
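The snippet this paragraph refers to is not shown here. One way to do it, sketched under the assumption that the names are attached through pd.Index objects passed to the constructor (the original may have been written differently):

```python
import pandas as pd

# The name= argument on each pd.Index becomes the name of that axis.
df = pd.DataFrame(
    [[i, i] for i in range(5)],
    index=pd.Index(range(5), name="row_name"),
    columns=pd.Index(["a", "b"], name="col_name"),
)

print(df)
print(f"Row Index Name: {df.index.name}")
print(f"Column Index Name: {df.columns.name}")
```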

col_name  a  b
row_name
0         0  0
1         1  1
2         2  2
3         3  3
4         4  4
Row Index Name: row_name
Column Index Name: col_name

This changes how the DataFrame is displayed. Both the row and column indexes now have a name. The name of the column index lines up with the column labels, and the name of the row index sits on top of the row labels.

Named by Pivot

Some operations like pivot can change the names of indexes. Here’s code to create a DataFrame that we can pivot.
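The creation code is missing from this copy, but the values can be read off the output below. A self-contained sketch (the variable name sales is hypothetical):

```python
import pandas as pd

# One row per (customer, product) purchase; values taken from the printed output.
sales = pd.DataFrame(
    {
        "customer": [10, 11, 10, 11, 12, 12],
        "product": ["phone", "phone", "tv", "laptop", "tv", "phone"],
        "amount": [5.0, 10.0, 7.0, 12.0, 3.0, 9.0],
    }
)

print(sales)
print(f"Row Index Name: {sales.index.name}")
print(f"Column Index Name: {sales.columns.name}")
```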

   customer product  amount
0        10   phone     5.0
1        11   phone    10.0
2        10      tv     7.0
3        11  laptop    12.0
4        12      tv     3.0
5        12   phone     9.0
Row Index Name: None
Column Index Name: None

Notice that so far the row and column indexes don’t have names. Now we’ll call pivot on the DataFrame and see what happens.
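The pivot call is not reproduced in this copy. A plain pivot leaves NaN for customer and product pairs that have no row, while the output below shows 0.0, so this sketch assumes the gaps were filled with fillna(0). That step, and the variable names, are assumptions rather than the author's confirmed code.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "customer": [10, 11, 10, 11, 12, 12],
        "product": ["phone", "phone", "tv", "laptop", "tv", "phone"],
        "amount": [5.0, 10.0, 7.0, 12.0, 3.0, 9.0],
    }
)

# pivot names the row index after the index= column and the
# column index after the columns= column.
pivoted = sales.pivot(index="customer", columns="product", values="amount")

# Assumed step: replace missing combinations with 0 to match the output.
pivoted = pivoted.fillna(0)

print(pivoted)
print(f"Row Index Name: {pivoted.index.name}")
print(f"Column Index Name: {pivoted.columns.name}")
```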

product   laptop  phone   tv
customer
10           0.0    5.0  7.0
11          12.0   10.0  0.0
12           0.0    9.0  3.0
Row Index Name: customer
Column Index Name: product

After the pivot, the former column labels product and customer have become the column index name and the row index name, respectively. This can be handy, but we’d like the option to update these names in case they no longer make sense.

Named by Update

The rename_axis method allows us to update the name of either the row or column index. The index parameter updates the name of the row index and the columns parameter updates the name of the column index.
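A self-contained sketch of the call described above. The pivoted frame is rebuilt directly here so the example runs on its own, and the variable names are hypothetical.

```python
import pandas as pd

# Stand-in for the pivoted frame, with the names that pivot assigned.
pivoted = pd.DataFrame(
    [[0.0, 5.0, 7.0], [12.0, 10.0, 0.0], [0.0, 9.0, 3.0]],
    index=pd.Index([10, 11, 12], name="customer"),
    columns=pd.Index(["laptop", "phone", "tv"], name="product"),
)

# rename_axis returns a new DataFrame; only the index names change,
# the labels and the data stay the same.
renamed = pivoted.rename_axis(index="account", columns="device")

print(renamed)
print(f"Row Index Name: {renamed.index.name}")
print(f"Column Index Name: {renamed.columns.name}")
```

Because rename_axis returns a new object by default, pivoted keeps its original names unless the result is assigned back.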

device   laptop  phone   tv
account
10          0.0    5.0  7.0
11         12.0   10.0  0.0
12          0.0    9.0  3.0
Row Index Name: account
Column Index Name: device

The index names are now ones that make more sense in the context of the pivoted DataFrame.

Name Removed

The final operation we can perform on index names is to remove them. This also uses the rename_axis method, with the new names set to None.
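A self-contained sketch of the removal step. As before, the starting frame is rebuilt by hand and the variable names are hypothetical.

```python
import pandas as pd

# Stand-in for the renamed frame from the previous section.
renamed = pd.DataFrame(
    [[0.0, 5.0, 7.0], [12.0, 10.0, 0.0], [0.0, 9.0, 3.0]],
    index=pd.Index([10, 11, 12], name="account"),
    columns=pd.Index(["laptop", "phone", "tv"], name="device"),
)

# Passing None clears the name on each axis.
unnamed = renamed.rename_axis(index=None, columns=None)

print(unnamed)
print(f"Row Index Name: {unnamed.index.name}")
print(f"Column Index Name: {unnamed.columns.name}")
```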

    laptop  phone   tv
10     0.0    5.0  7.0
11    12.0   10.0  0.0
12     0.0    9.0  3.0
Row Index Name: None
Column Index Name: None

Conclusion

Once you know how to work with index names, they are easy to apply or remove. The meaning of the data in a DataFrame with index names is clearer, which makes mistakes such as joining on mismatched labels easier to spot. That can lead to better data quality and more accurate insights from your data.

Eric Ness
When I Work Data

Principal Machine Learning Engineer at C.H.Robinson, a Fortune 250 supply chain company.