TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

What’s new in Pandas 2.0?

The five things to know about the big release

5 min read · Apr 10, 2023


Pandas 2.0 hit general availability on April 3, 2023. Let’s see what features are hotter than a Corgi in the sunshine. ☀️

corgi dog with tongue out in the sun on grass
Source: pixabay.com

Three years ago I wrote What’s new in Pandas 1.0. One pandemic and a bunch of AI advances later, here we are with pandas 2.0.

Pandas is the standard, brain-friendly Python library for working with data. The 2.0 update is all about making pandas faster and more memory efficient. Memory is the number one reason people need to leave pandas for Dask, Ray, SQL databases, Spark DataFrames, and other tools. The more you can reduce memory use while working in pandas, the easier life is. 🙂

As you might expect with a major release version, pandas 2.0 has a number of significant changes. Let’s dig in!

pyarrow 🐍➡️

If there’s one word to sum up this release it’s pyarrow.

Pandas was built using NumPy data structures for memory management. Now you have the option to use pyarrow as your backing memory format.

Using pyarrow can give you a speedup and more memory-efficient operations, because you can take advantage of the C++ implementation of Arrow. Fun fact: the creator of pandas, Wes McKinney, went on to work on Arrow in…


Published in TDS Archive



Written by Jeff Hale

I write about data things. Follow me on Medium and join my Data Awesome mailing list to stay on top of the latest data tools and tips: https://dataawesome.com
