What’s new in Pandas 2.0?
The five things to know about the big release
Pandas 2.0 hit general availability on April 3, 2023. Let’s see what features are hotter than a Corgi in the sunshine. ☀️
Three years ago I wrote What’s new in Pandas 1.0. One pandemic and a bunch of AI advances later, here we are with pandas 2.0.
Pandas is the standard, brain-friendly Python library for working with data. The 2.0 update is all about making pandas faster and more memory efficient. Running out of memory is the number one reason people leave pandas for Dask, Ray, SQL databases, Spark DataFrames, and other tools. The more you can reduce memory use while working in pandas, the easier life is. 🙂
As you might expect with a major release version, pandas 2.0 has a number of significant changes. Let’s dig in!
pyarrow 🐍➡️
If there’s one word to sum up this release, it’s pyarrow.
Pandas was built on NumPy data structures for memory management. Now you have the option to use pyarrow as your backing memory format.
Using pyarrow gives you a speedup and more memory-efficient operations, because you can take advantage of the C++ implementation of Arrow. Fun fact: the creator of pandas, Wes McKinney, went on to work on Arrow in…

