Forgetful pandas 🐼
Pandas memory reports can mislead: learn how to see the real usage and save memory!
The pandas library is THE tool for data cleaning, data prep, and data analysis in Python. Once you find your way around its expansive API, it's a joy to work with.
Pandas stores your data in memory, which makes operations zippy! The downside is that a large dataset might not fit into your machine's memory, grinding your work to a halt. ☹️
Often, it's handy to know how much memory your pandas DataFrame occupies. I've been working with pandas and teaching it to students for several years. I even wrote a little book on getting started with it. I had heard rumors that the memory usage shown by default wasn't always accurate. 🤔 But there are so many data things to explore, and so little time, that I only recently got around to investigating. I was shocked to learn how huge the memory undercount can be. 😲
In this article I'll show you a few ways to get the true memory usage for your pandas objects. Then I'll share eight solutions for when your data won't fit into memory.
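Here's a quick taste of the undercount, as a minimal sketch rather than a benchmark. The DataFrame and its `city` column are made up for illustration, and the column is forced to `object` dtype so it holds ordinary Python strings. By default, `memory_usage()` counts only the pointers in an object column, while `deep=True` also counts the strings those pointers refer to.

```python
import pandas as pd

# Hypothetical data: 100,000 distinct Python strings in an object-dtype column
cities = pd.Series([f"city_{i}" for i in range(100_000)], dtype=object)
df = pd.DataFrame({"city": cities})

# Default (shallow) report: counts the pointers stored in the object column
shallow = df.memory_usage().sum()

# Deep report: also inspects each Python string the pointers refer to
deep = df.memory_usage(deep=True).sum()

print(f"shallow: {shallow:,} bytes")
print(f"deep:    {deep:,} bytes")
```

The exact numbers depend on your Python and pandas versions, but the deep figure should come out several times larger than the shallow one. You can get the same deep accounting in the summary view with `df.info(memory_usage="deep")`.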