ETL Using Python and Pandas

I was working on a CRM deployment and needed to migrate data from the old system to the new one.

The dataset had 50k rows and fewer than a dozen columns, and was straightforward by any measure. The file was smaller than 10 MB. Sadly, that was enough to choke Excel on a modern-day ThinkPad with 20 GB of RAM.

Whipping up a short Pandas script was simpler. This is a quick summary. A Jupyter (IPython) version is also available.
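For readers who have not used Pandas for this kind of job, here is a minimal sketch of the extract, transform, load shape such a script can take. It is not the actual migration script: the column names, the sample rows, and the `extract`, `transform`, and `load` helpers are all hypothetical, chosen only to illustrate the idea.

```python
import io

import pandas as pd

# Hypothetical legacy export; column names and rows are illustrative only.
LEGACY_CSV = """Name,Email,Phone
 Alice Wong ,ALICE@EXAMPLE.COM,555-0100
Bob Lee,bob@example.com,
Bob Lee,bob@example.com,
"""


def extract(source):
    # Read every column as a string so phone numbers and IDs keep their formatting.
    return pd.read_csv(source, dtype=str)


def transform(df):
    # Rename columns to the (assumed) field names expected by the new system.
    out = df.rename(columns={"Name": "full_name", "Email": "email", "Phone": "phone"})
    # Trim stray whitespace and normalize email case.
    out["full_name"] = out["full_name"].str.strip()
    out["email"] = out["email"].str.strip().str.lower()
    # Replace missing phone numbers with empty strings.
    out["phone"] = out["phone"].fillna("")
    # Drop exact duplicate rows and renumber the index.
    return out.drop_duplicates().reset_index(drop=True)


def load(df):
    # Return CSV text; a real migration would write a file or feed an import tool.
    return df.to_csv(index=False)


raw = extract(io.StringIO(LEGACY_CSV))
clean = transform(raw)
output_csv = load(clean)
```

Reading from an in-memory string keeps the sketch self-contained; with a real export, `extract` would be given a file path instead.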

The sample data files are published on GitHub.

Originally published at Kenneth Lo, PMP.