Data exploration. You should have pandas functions like .corr(), scatter_matrix(), .hist() and .plot.bar() on the tip of your tongue. You should also look for opportunities to visualize high-dimensional data by projecting it down to two dimensions with PCA or t-SNE, using sklearn's PCA and TSNE classes.
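As a rough sketch of what that first pass can look like, the snippet below runs these calls on scikit-learn's bundled iris dataset. The dataset, the standardization step, and the parameter values (bin count, perplexity, random seed) are assumptions chosen for illustration, not recommendations from the text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get interactive windows
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Example data (an assumption): four numeric features plus a "target" column.
df = load_iris(as_frame=True).frame
features = df.drop(columns="target")

# Pairwise Pearson correlations between the numeric columns.
corr = features.corr()
print(corr.round(2))

# One histogram per column, then every pairwise scatter plot.
features.hist(bins=20, figsize=(8, 6))
scatter_matrix(features, figsize=(8, 8))

# Bar chart of class counts, drawn on its own axes.
fig_bar, ax_bar = plt.subplots()
df["target"].value_counts().sort_index().plot.bar(ax=ax_bar)

# Standardize first so no single feature dominates the projections.
scaled = StandardScaler().fit_transform(features)

# Project to 2-D with PCA (linear) and t-SNE (nonlinear).
pca_2d = PCA(n_components=2).fit_transform(scaled)
tsne_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(scaled)

# Side-by-side scatter plots, colored by class label.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(pca_2d[:, 0], pca_2d[:, 1], c=df["target"])
axes[0].set_title("PCA")
axes[1].scatter(tsne_2d[:, 0], tsne_2d[:, 1], c=df["target"])
axes[1].set_title("t-SNE")
```

In an interactive session, call plt.show() (or save the figures with savefig) to see the plots. Note that scatter_matrix lives in pandas.plotting and that bar charts are reached through the .plot accessor.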