Using Pandas pipe function to improve code readability
An intuitive tutorial for the best practice with Pandas pipe()
In data processing, it is often necessary to write a function that performs an operation (such as a statistical calculation, splitting, or value substitution) on a certain row or column to obtain new data.
Instead of writing
# f(), g(), and h() are user-defined functions
# df is a Pandas DataFrame
f(g(h(df), arg1=a), arg2=b, arg3=c)
We can write
(df.pipe(h)
.pipe(g, arg1=a)
.pipe(f, arg2=b, arg3=c)
)
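To make the comparison concrete, here is a minimal runnable sketch. The bodies of h(), g(), and f(), the sample DataFrame, and the values of a, b, and c are all hypothetical stand-ins chosen for illustration; only the two calling styles come from the snippets above.

```python
import pandas as pd

# Hypothetical stand-ins for the user-defined functions h(), g(), and f().
def h(df):
    """Drop rows that contain missing values."""
    return df.dropna()

def g(df, arg1):
    """Keep rows whose 'score' is at least arg1."""
    return df[df["score"] >= arg1]

def f(df, arg2, arg3):
    """Add a column named arg2 holding 'score' multiplied by arg3."""
    return df.assign(**{arg2: df["score"] * arg3})

# Illustrative data and arguments (assumptions, not from the article).
df = pd.DataFrame({"name": ["a", "b", "c", "d"],
                   "score": [1.0, None, 3.0, 4.0]})
a, b, c = 2.0, "scaled", 10

# Nested style: reads inside-out.
nested = f(g(h(df), arg1=a), arg2=b, arg3=c)

# pipe() style: reads top to bottom, in execution order.
piped = (df.pipe(h)
           .pipe(g, arg1=a)
           .pipe(f, arg2=b, arg3=c))

print(piped)
```

Both forms should produce the same DataFrame. The difference is only in how the steps read: the piped version lists them in the order they run.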
Pandas introduced pipe() in version 0.16.2. pipe() lets user-defined functions take part in method chains.
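As a rough illustration of a user-defined function sitting inside a chain of built-in methods, consider the sketch below. The helper add_total(), the orders DataFrame, and its column names are assumptions made up for this example.

```python
import pandas as pd

def add_total(df, price_col, qty_col):
    """Hypothetical helper: add a 'total' column as price * quantity."""
    return df.assign(total=df[price_col] * df[qty_col])

# Illustrative data (an assumption, not from the article).
orders = pd.DataFrame({
    "item": ["pen", "book", "pen"],
    "price": [1.5, 12.0, 1.5],
    "qty": [4, 1, 2],
})

summary = (
    orders
    .pipe(add_total, price_col="price", qty_col="qty")  # user-defined step
    .groupby("item", as_index=False)["total"].sum()     # built-in steps
    .sort_values("total", ascending=False)
    .reset_index(drop=True)
)

print(summary)
```

Here orders.pipe(add_total, ...) is equivalent to add_total(orders, ...); pipe() passes the DataFrame as the first argument, so the custom step slots in next to groupby() and sort_values() without breaking the chain.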
Method chaining is a programming style of invoking multiple method calls in sequence, with each call performing an action on the same object and returning it.
It eliminates the cognitive burden of naming variables at each intermediate step. Fluent interface, a method of creating object-oriented APIs, relies on method cascading (aka…