Ten ways to access a Pandas DataFrame: which one is the best?

Andrew Zhu (Shudong Zhu)
Published in CodeX · 4 min read · Mar 28, 2021

To select data from a pandas DataFrame, we can use df_data['column'], and we can also use df_data.column. Then there is label-based df_data.loc[:, 'column'] and, yeah, position-based df_data.iloc[:, index]. Next comes pd.eval(), and don't forget df_data.query(). If that is not enough, there is a package called numexpr, and many more.
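As a quick preview, here is a minimal sketch of how these forms look side by side. The df_data frame and its 'name' and 'score' columns are made up for illustration only. Note that loc and iloc take the row selector first, so picking a column needs a leading colon.

```python
import pandas as pd

# Hypothetical frame for illustration only (not the article's sample data)
df_data = pd.DataFrame({'name': ['a', 'b', 'c'], 'score': [10, 20, 30]})

by_key = df_data['score']            # dictionary-style
by_attr = df_data.score              # attribute-style
by_label = df_data.loc[:, 'score']   # label-based: all rows, column 'score'
by_pos = df_data.iloc[:, 1]          # position-based: all rows, second column

# query() filters rows with a string expression
rows = df_data.query('score > 15')

# pd.eval() evaluates a string expression; local_dict names the variables it may use
mask = pd.eval('df_data.score > 15', local_dict={'df_data': df_data})
```

The first four forms all return the same Series here, while query() returns the matching rows and pd.eval() returns a boolean Series.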

The Zen of Python says:

There should be one — and preferably only one — obvious way to do it.

Hey Pandas DataFrame, is there one best and obvious way to select data? Let’s go through 10 ways one by one and see if we can find the answer.

Say we have a sample DataFrame:

import pandas as pd
pd_data = pd.DataFrame(
    {'employee': ['Bob', 'Jake', 'Lisa', 'Sue', 'Andrew'],
     'group': ['Accounting', 'Engineering', 'Engineering', 'HR', 'Engineering']}
)

If we print pd_data, the result looks like this:

  employee        group
0      Bob   Accounting
1     Jake  Engineering
2     Lisa  Engineering
3      Sue           HR
4   Andrew  Engineering

Dictionary-style and Attribute-style

It is intuitive to use dictionary-style access to fetch a column:

pd_data['employee']
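To make the comparison concrete, here is a small sketch using the sample data from above. It shows what dictionary-style access returns and checks that the attribute-style form gives the same column. The variable names employees, subset, and same are just for illustration.

```python
import pandas as pd

pd_data = pd.DataFrame(
    {'employee': ['Bob', 'Jake', 'Lisa', 'Sue', 'Andrew'],
     'group': ['Accounting', 'Engineering', 'Engineering', 'HR', 'Engineering']}
)

# Dictionary-style with a single label returns a Series
employees = pd_data['employee']

# Dictionary-style with a list of labels returns a DataFrame
subset = pd_data[['employee']]

# Attribute-style access returns the same column as the dictionary-style form
same = pd_data.employee.equals(pd_data['employee'])
print(type(employees).__name__, type(subset).__name__, same)
```

Attribute-style only works when the column name is a valid Python identifier and does not collide with an existing DataFrame attribute or method, so the dictionary-style form is the safer default.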
