DS Simplified
10 Pandas Functions I Regret Overlooking in My Data Science Journey
Unlocking the Hidden Gems for Efficient Data Manipulation
When I started with Pandas, I thought I knew enough: read_csv(), groupby(), merge(), and the usual suspects. But over time, I realized I had been missing out on some incredibly powerful functions that could have saved me hours of debugging, unnecessary loops, and inefficient code. In this article, I’ll share 10 Pandas functions I regret overlooking in my early days of learning data science.
1) .explode()
Expanding Lists into Rows
I used to combine apply() with a for loop to split lists inside a column into separate rows. On large datasets, that approach becomes slow and inefficient.
.explode() does the same thing in one line, making operations on nested lists much faster and cleaner.
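import pandas as pd
# Each row stores a list in the 'Values' column
df = pd.DataFrame({'ID': [1, 2], 'Values': [[10, 20, 30], [40, 50]]})
# One output row per list element; 'ID' is repeated alongside each element
df_exploded = df.explode('Values')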
Tip: Use .reset_index(drop=True) after .explode() if you want a clean index.
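To see why the index needs cleaning, here is a minimal sketch that reuses the same toy DataFrame. The variable names (exploded, clean, same) are just for illustration, and the ignore_index=True shortcut assumes pandas 1.1 or newer.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2], 'Values': [[10, 20, 30], [40, 50]]})

# By default, every exploded row keeps the index label of the row it came from,
# so labels repeat once per list element.
exploded = df.explode('Values')
print(exploded.index.tolist())   # [0, 0, 0, 1, 1]

# reset_index(drop=True) replaces the repeated labels with a fresh 0..n-1 range.
clean = df.explode('Values').reset_index(drop=True)
print(clean.index.tolist())      # [0, 1, 2, 3, 4]

# Assumption: pandas >= 1.1, where explode() accepts ignore_index=True
# and produces the same clean index in a single call.
same = df.explode('Values', ignore_index=True)
print(same.index.tolist())       # [0, 1, 2, 3, 4]
```

One thing to watch: the exploded column comes back with object dtype, so you may want to follow up with something like .astype(int) before doing numeric work on it.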