Pandas Notes

Reading a file :

import pandas as pd
if csv:
df = pd.read_csv(file_path)
if excel :

Writing to a file

if csv:
df.to_csv(file_opath, index=False)
index  = False - it wont print line numbers to csv file
if excel:

Querying a dataframe

df = df[(df.TRU == “Rural”) | (df.TRU == “Urban”)]

Adding a column


df_f[‘Sex’]=’Female’ # Sex column is added filled with Female

Renaming an existing column:

df_f.rename(columns={‘TOT_F’:’Total’}, inplace=True)
TOT_F replaced by Total

Picking columns of interest

interested_columns=[‘State’, ‘District’, ‘TRU’, ‘Name’, ‘Religion’, ‘TOT_M’, ‘TOT_F’]

Display rows

top 10 rows:
bottom 10 rows:

Applying functions on Dataframe

def func(ds):
if ds.Name.startswith(‘district’) == 0:
ds[‘geo_code’] = ds.State # this creates a column ‘geo_code’ and assigns the data
ds[‘geo_code’] = ds.District

if ds.Name == ‘india’:
ds[‘geo_level’] = ‘country’
ds[‘geo_code’] = ‘IN’
elif ds.Name.startswith(‘district’):
ds[‘geo_level’] = ‘district’
elif ds.Name.startswith(‘state’):
ds[‘geo_level’] = ‘state’
return ds
df=df.apply(func, axis=1)
if axis is 1 it is on column , if 0 on row

Concatenating the tables(dataframe)

df_m # df male data
df_f # df female data
result = pd.concat([df_m, df_f])

Removing duplicate files


Applying string function on column:

Upper case to lower case

# upper case to lower case
One clap, two clap, three clap, forty?

By clapping more or less, you can signal to us which stories really stand out.