Top 10 Categories of Pandas Functions That I Use Most
Get familiar with these functions to help you process data
People love Python because it has a rich ecosystem of third-party libraries for all kinds of work. For data science, one of the most popular libraries for data processing is pandas. Because it is open source, many developers have contributed to the project over the years, making pandas powerful enough for almost any data processing job.
I haven't counted, but pandas seems to offer hundreds of functions. I use maybe twenty or thirty of them frequently, and it's unrealistic to cover them all. So in this post, I'll focus on the 10 most useful categories of functions. Once you're comfortable with them, they can probably address over 70% of your data processing needs.
1. Reading data
We usually read data from external sources. Depending on the format of the source data, we can use the corresponding read_* functions.
read_csv: use it when your source data is in the CSV format. Some notable arguments include header (whether and which row is the header), sep (the delimiter), and usecols (a subset of columns to use).
read_excel: use it when your source data is in Excel…
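To make the read_csv arguments mentioned above concrete, here is a minimal sketch. The data is made up for illustration: an in-memory string stands in for a real file, and the column names (name, age, city) and the semicolon delimiter are assumptions, not something from a particular dataset.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data standing in for a file on disk;
# the column names and values are invented for this example.
raw = io.StringIO(
    "name;age;city\n"
    "Ann;34;Oslo\n"
    "Bob;27;Lima\n"
)

df = pd.read_csv(
    raw,
    sep=";",                  # the delimiter used in the source
    header=0,                 # the first row holds the column names
    usecols=["name", "age"],  # load only a subset of the columns
)
print(df)
```

With a real file, you would pass its path in place of the StringIO object. read_excel is called in a similar way, though it typically needs an Excel engine such as openpyxl installed.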