[Week 11] .iloc VS .loc VS .ix
After Ben wrote some graphing code for me, I had no idea what the syntax meant, so I started googling it piece by piece.
.loc, .iloc and .ix are very important when it comes to extracting rows or columns from your data frame (df is the name Stack Overflow people tend to use)
df.loc[row, col] is label based.
I prefer label based because I tend to know what my headings are, and I can just pass the string into the column slot, like .loc[:, 'str']
Eg. df.loc[:, ['Cat_Names', 'Dog_Names']] → extract all rows and only the columns Cat_Names and Dog_Names
Eg. df.loc[0:2, :] → loc includes both 0 and 2, so 0:2 = the rows labelled 0, 1, 2 (the slice goes without extra brackets, [[0:2], :] is a syntax error)
You can even use a Boolean condition
Eg. df.loc[df.Cat_Names == 'Lucy', :] → extract the rows where column 'Cat_Names' equals Lucy and output all the columns, hence the (:) syntax
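Here is a minimal sketch for trying the three .loc examples above. The data frame is made up for illustration (the pet names and the Age column are my own assumptions, not real data), and it uses the default 0, 1, 2, ... row index.

```python
import pandas as pd

# Hypothetical toy data; only the column names Cat_Names and Dog_Names
# come from the notes above, everything else is invented.
df = pd.DataFrame({
    "Cat_Names": ["Lucy", "Milo", "Tom", "Oreo"],
    "Dog_Names": ["Rex", "Bella", "Max", "Daisy"],
    "Age": [3, 5, 2, 7],
})

# All rows, only the two named columns.
names = df.loc[:, ["Cat_Names", "Dog_Names"]]

# Label slice: with the default index, rows labelled 0, 1 and 2 are all included.
first_three = df.loc[0:2, :]

# Boolean condition: rows where Cat_Names equals 'Lucy', all columns.
lucy = df.loc[df["Cat_Names"] == "Lucy", :]

print(names)
print(first_three)
print(lucy)
```

I wrote df["Cat_Names"] instead of df.Cat_Names in the condition; both work here, but the bracket form also works for column names with spaces.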
df.iloc[row, col] is integer position based.
I don’t really like to use this because I have 7000 rows and god knows how many columns, and I ain’t got all day to count their index. Python also starts indexing from 0.
Eg. df.iloc[:, 0:4] → extract all rows and only the first to the fourth column.
.iloc includes the first number but excludes the second, so [0:4] will output → positions 0, 1, 2, 3
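To see the end-point difference between .iloc and .loc side by side, here is a small sketch. Again the data frame is invented (all column names and values are assumptions for illustration) and uses the default integer row index.

```python
import pandas as pd

# Hypothetical toy data with five columns.
df = pd.DataFrame({
    "Cat_Names": ["Lucy", "Milo", "Tom", "Oreo"],
    "Dog_Names": ["Rex", "Bella", "Max", "Daisy"],
    "Age": [3, 5, 2, 7],
    "Weight": [4.1, 5.3, 3.8, 6.0],
    "Owner": ["Ann", "Bob", "Cy", "Dee"],
})

# Positions 0, 1, 2, 3: the first four columns, position 4 is left out.
first_four_cols = df.iloc[:, 0:4]

# Same slice text, different result on rows:
rows_by_position = df.iloc[0:2, :]  # positions 0 and 1 only
rows_by_label = df.loc[0:2, :]      # labels 0, 1 and 2

print(list(first_four_cols.columns))
print(len(rows_by_position), len(rows_by_label))
```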
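df.ix[row, col] takes both integer positions and labels. Heads up: .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so it only works on older versions.
This is pretty good and flexible when I only know ‘some’ of the index positions and want to mix them with labels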
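On pandas 1.0 and later, where .ix is no longer available, the same kind of mixing can be sketched with .iloc or .loc plus a small translation step. This is just one way to do it, on a made-up data frame (names and values are assumptions): df.columns.get_loc turns a column label into its position, and df.index[...] turns row positions into labels.

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({
    "Cat_Names": ["Lucy", "Milo", "Tom", "Oreo"],
    "Dog_Names": ["Rex", "Bella", "Max", "Daisy"],
    "Age": [3, 5, 2, 7],
})

# Goal: first two rows by position, the 'Age' column by label.

# Option 1: convert the label to a position, then use .iloc.
age_pos = df.columns.get_loc("Age")
via_iloc = df.iloc[0:2, age_pos]

# Option 2: convert the positions to labels, then use .loc.
via_loc = df.loc[df.index[0:2], "Age"]

print(via_iloc.tolist())
print(via_loc.tolist())
```

Both options pick the same two values, so which one to use is mostly a matter of whether I am thinking in labels or positions at that moment.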