My Python journey, Pt. 3: Seaborn, NLTK, and more

Marin Gow
Apr 20, 2018 · 2 min read

This last month has been challenging in the Python world. My mentor and I spent a lot of time cleaning up my Enron emails dataset and getting it correctly entered into my Postgres database. I’m learning how intuitive Python can be, but also how critical familiarity with modules is if you want to accomplish anything.

My Python code grew from one file to five over the last couple of weeks. To a developer, this is probably not a big deal, but I’ve never created a project of my own with multiple files before. Each of my ORM classes has its own file, which means I import them in the other files that reference them.

And here is my cleaned up, functional code for reading the Enron email files and writing them to Postgres:
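As a rough illustration of the read-then-write flow (not the project code itself), the sketch below parses one raw message with Python's built-in `email` module and stores it in a table. Everything specific here is an assumption: the sample message, the `parse_email` and `save_email` helpers, and the table layout are made up, and an in-memory SQLite database with plain SQL stands in for Postgres and the ORM classes so the example runs on its own.

```python
import sqlite3
from email import message_from_string

# Made-up sample message using the usual headers-then-body layout.
SAMPLE = """\
Message-ID: <12345.example@thyme>
Date: Mon, 14 May 2001 16:39:00 -0700 (PDT)
From: sender@example.com
To: first@example.com, second@example.com
Subject: Schedule

Please send the updated schedule.
"""


def parse_email(raw):
    """Return the main headers and the body of one raw email as a dict."""
    msg = message_from_string(raw)
    to_header = msg["To"] or ""
    return {
        "message_id": msg["Message-ID"],
        "sender": msg["From"],
        # Split the To header into a clean list of addresses.
        "recipients": [a.strip() for a in to_header.split(",") if a.strip()],
        "subject": msg["Subject"],
        "body": msg.get_payload().strip(),
    }


def save_email(conn, record):
    """Insert one parsed email (hypothetical table layout)."""
    conn.execute(
        "INSERT INTO emails (message_id, sender, recipients, subject, body) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            record["message_id"],
            record["sender"],
            ", ".join(record["recipients"]),
            record["subject"],
            record["body"],
        ),
    )


# SQLite in memory is a stand-in for Postgres in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (message_id TEXT PRIMARY KEY, sender TEXT, "
    "recipients TEXT, subject TEXT, body TEXT)"
)
record = parse_email(SAMPLE)
save_email(conn, record)
conn.commit()
stored_count = conn.execute("SELECT COUNT(*) FROM emails").fetchone()[0]
```

Keeping the parsing step separate from the database step makes each one easier to debug: the parser can be checked against a single file before anything is written.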

I’m now in the data exploration phase, where I have two primary goals. The first is to familiarize myself with data visualization in Python. The second is to use NLTK to try out natural language processing.

For the data visualization, I selected Seaborn, a Python module built on top of matplotlib that makes it easier to produce pretty graphs. I also used Pandas to read data in from Postgres and store it in a dataframe. It's still a work in progress, but here is the graph I made of Kenneth Lay's most frequent contacts, Lay being the former CEO of Enron:

Important question: Who is Mrslinda?

Here is the visualization code. I want to do a lot more with this, but I try to take advantage of my time with Jay to get a minimum viable product working, and then switch topics.
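For a sense of what a "most frequent contacts" chart involves, here is a minimal sketch using Pandas and Seaborn. It is not the project code: the addresses, the column names `sender` and `recipient`, and the `top_contacts` helper are all hypothetical, and a small hand-built dataframe stands in for the result of a Postgres query.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical rows, one per (sender, recipient) pair on an email.
emails = pd.DataFrame(
    {
        "sender": ["ceo@example.com"] * 6 + ["other@example.com"],
        "recipient": [
            "a@example.com", "a@example.com", "a@example.com",
            "b@example.com", "b@example.com",
            "c@example.com",
            "a@example.com",
        ],
    }
)


def top_contacts(df, sender, n=10):
    """Count how often `sender` wrote to each recipient, most frequent first."""
    sent = df[df["sender"] == sender]
    counts = sent["recipient"].value_counts().head(n)
    return counts.rename_axis("recipient").reset_index(name="emails")


top = top_contacts(emails, "ceo@example.com")

# Horizontal bars: one bar per recipient, bar length = number of emails.
ax = sns.barplot(data=top, x="emails", y="recipient", color="steelblue")
ax.set_xlabel("Emails sent")
ax.set_ylabel("")
ax.set_title("Most frequent contacts")
```

Calling `ax.figure.savefig("contacts.png")` afterwards would write the chart to a file; the file name is only an example.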

With NLTK, I want to start with sentiment analysis, and there’s a cool module called VADER that does a lot of the work for you. Ideally, I’d like to create a visualization of the relationships between the different Enron employees that shows which links were strongest (most emails exchanged) and what the tone of those emails was (positive vs. negative sentiment). For my final product, I’m envisioning a folder of analyses and visualizations that tell the Enron story through the emails.
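One way the relationship data behind such a visualization could be shaped is sketched below. It assumes each email has already been given a sentiment score between -1 and 1 (VADER's `polarity_scores` returns a `compound` value in that range), and it only covers the aggregation step. The addresses, the scores, and the `summarize_links` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical (sender, recipient, compound score) rows, one per email.
scored = [
    ("alice@example.com", "bob@example.com", 0.6),
    ("bob@example.com", "alice@example.com", 0.2),
    ("alice@example.com", "carol@example.com", -0.5),
]


def summarize_links(rows):
    """Group emails by unordered pair; return count and mean sentiment per pair."""
    totals = defaultdict(lambda: [0, 0.0])  # pair -> [email count, score sum]
    for sender, recipient, compound in rows:
        # Sort so A->B and B->A count toward the same link.
        pair = tuple(sorted((sender, recipient)))
        totals[pair][0] += 1
        totals[pair][1] += compound
    return {
        pair: {"emails": count, "mean_sentiment": total / count}
        for pair, (count, total) in totals.items()
    }


links = summarize_links(scored)

# The strongest link is the pair that exchanged the most emails.
strongest = max(links, key=lambda pair: links[pair]["emails"])
```

In a graph drawing, the email count could set the thickness of each link and the mean sentiment its color.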

As always, thanks to Jay for the time he spends every week teaching me everything from PyCharm debugging tips to what in the world heaps do.
