Civis R&D Bookshelf: How not to select subsets of data in pandas, text to speech using neural nets, and data viz Makeover Monday

by Peter Skipper

Civis Analytics

Published in The Civis Journal

2 min read · Jan 5, 2018

Pandas: A How NOT To Primer

There is a plethora of tutorials on how to use pandas, one of the most popular Python tools for data science. However, the article above is one of the few I’ve come across that specifically covers anti-patterns, helping you write cleaner code and avoid mistakes. This approachable piece from Ted Petrou explains how to avoid slicing and indexing mistakes in pandas, and elaborates on the reasoning behind a warning that many pandas users will recognize: the SettingWithCopyWarning. It’s a great way to learn more idiomatic pandas.
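To make the warning concrete, here is a minimal sketch of the kind of chained-indexing mistake that typically triggers it, next to the single `.loc` call that avoids it. The DataFrame, its column names, and the variable names are made up for illustration and are not taken from Ted Petrou's article; the warning is silenced only so the snippet runs quietly.

```python
import warnings

import pandas as pd

# Hypothetical example data, not from the article.
df = pd.DataFrame(
    {"city": ["Chicago", "Boston", "Denver"], "score": [1, 5, 9]}
)

# Anti-pattern: chained indexing. The boolean selection returns a copy,
# so the assignment lands on a temporary object and the original frame
# is left unchanged. pandas normally emits a warning here.
chained = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    chained[chained["score"] > 4]["score"] = 0

# Preferred: select rows and column in a single .loc call, which
# assigns directly into the frame.
fixed = df.copy()
fixed.loc[fixed["score"] > 4, "score"] = 0

print(chained["score"].tolist())
print(fixed["score"].tolist())
```

The chained version silently leaves the data untouched, which is exactly why the warning exists; the `.loc` version updates the rows in place.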

Text to Speech with Neural Nets

Google Research has released a new paper on generating speech from text with more natural cadence and emphasis. The result is computer-generated audio that sounds much less stilted, and much more like a professional recording. It’s amazing to see how quickly computers are learning to sound less like Siri, and without a lot of linguistic feature engineering. Check out some audio samples here, where you can even listen to human and computer versions of the same text and try to tell them apart (it’s pretty tough!). If you’re curious about the implementation, the short paper here provides details.

Level Up Your Data Viz with Makeover Monday

Makeover Monday is a community-driven data visualization project. Each week, the founders choose a data visualization published somewhere on the web and encourage participants to rework it to tell the story more effectively. They strive to make interesting changes without insulting the original authors, and the result is an amazing series of digestible tips on making your visualizations more understandable and accessible. You can check out one of my favorite pieces here. And let us know on Twitter if you decide to participate!

This post is part of our Bookshelf series organized by the Data Science R&D department at Civis Analytics. In this series, Civis data scientists share links to interesting software tools, blog posts, scientific articles, and other things that they have read about recently, along with a little commentary about why these things are worth checking out. Are you reading anything interesting? We’d love to hear from you on Twitter.
