[Week 11] .iloc VS .loc VS .ix

After Ben wrote some graphing code for me, I had no idea what the syntax meant, so I started googling it piece by piece.

.loc | .iloc | .ix are very important when it comes to extracting rows or columns from your data frame (df), and they are what Stack Overflow people tend to use.

.loc

df.loc[row, col] is label based.

I prefer label based because I tend to know what my headings are and I can just pass the string into .loc['str']

Eg. df.loc[:, ['Cat_Names', 'Dog_Names']] → extract all rows and only the columns Cat_Names and Dog_Names

Eg. df.loc[0:2, :] → loc includes both 0 and 2, so 0:2 = 0, 1, 2 (these are index labels, which only match row positions when the df has the default integer index)

You can even use a Boolean condition

Eg. df.loc[df.Cat_Names == 'Lucy', :] → extract every row where column 'Cat_Names' equals Lucy and output all the columns, hence the (:) syntax
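
To check the three .loc examples above for myself, here is a small sketch. The df and all the names and ages in it are made up just for illustration (only Cat_Names, Dog_Names and Lucy come from the examples), and it assumes the default integer index.

```python
import pandas as pd

# Made-up data for illustration; the df keeps the default index 0, 1, 2, 3
df = pd.DataFrame({
    "Cat_Names": ["Lucy", "Milo", "Luna", "Oscar"],
    "Dog_Names": ["Rex", "Bella", "Max", "Daisy"],
    "Age": [3, 5, 2, 7],
})

# All rows, only the two named columns
names = df.loc[:, ["Cat_Names", "Dog_Names"]]

# Label slice: .loc includes both ends, so 0:2 gives labels 0, 1, 2
first_three = df.loc[0:2, :]

# Boolean condition: keep only rows where Cat_Names equals "Lucy"
lucy = df.loc[df.Cat_Names == "Lucy", :]

print(names)
print(first_three)
print(lucy)
```

If the index were something other than 0, 1, 2, ... (dates or strings, say), the 0:2 slice would have to use those labels instead.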

.iloc

df.iloc[row, col] is integer position based.

I don’t really like to use this because I have 7000 rows and god knows how many columns, and I ain’t got all day to count their positions. Python also starts indexing from 0.

Eg. df.iloc[:, 0:4] → extract all rows and only the first column to the fourth one.

.iloc includes the first number but excludes the second, so [0:4] will output → 0, 1, 2, 3
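
A quick sketch of the .iloc slicing above, again on a made-up df (the Age, City and Vet columns are hypothetical). It also shows one possible way around counting columns by hand: df.columns.get_loc() returns the integer position of a column label.

```python
import pandas as pd

# Made-up data for illustration, five columns wide
df = pd.DataFrame({
    "Cat_Names": ["Lucy", "Milo", "Luna", "Oscar"],
    "Dog_Names": ["Rex", "Bella", "Max", "Daisy"],
    "Age": [3, 5, 2, 7],
    "City": ["Perth", "Sydney", "Hobart", "Darwin"],
    "Vet": ["A", "B", "A", "C"],
})

# Position slice: 0:4 takes positions 0, 1, 2, 3 and leaves out 4 ("Vet")
first_four_cols = df.iloc[:, 0:4]

# Same rule for rows: 0:3 takes the first three rows
first_three_rows = df.iloc[0:3, :]

# Look up a column's position instead of counting it by hand
age_pos = df.columns.get_loc("Age")

print(first_four_cols.columns.tolist())
print(first_three_rows)
print(age_pos)
```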

.ix

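df.ix[row, col] takes both integer positions and labels. Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so it only works on older versions.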

This is pretty good and flexible when I know ‘some’ of the index positions and want to mix them with labels.
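
On a pandas version that no longer has .ix, the same mixing of positions and labels can be sketched with .loc or .iloc. This is just one way to do it, on a made-up df with a hypothetical string index (a, b, c) so that labels and positions are clearly different things.

```python
import pandas as pd

# Made-up data for illustration, with string row labels
df = pd.DataFrame(
    {"Cat_Names": ["Lucy", "Milo", "Luna"], "Dog_Names": ["Rex", "Bella", "Max"]},
    index=["a", "b", "c"],
)

# Rows by position, column by label: turn the positions into labels for .loc
via_loc = df.loc[df.index[0:2], "Dog_Names"]

# Or go the other way: turn the column label into a position for .iloc
via_iloc = df.iloc[0:2, df.columns.get_loc("Dog_Names")]

print(via_loc)
print(via_iloc)
```

Both lines pick the first two rows of Dog_Names; which one reads better depends on whether I know more labels or more positions.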