Ned H · Incorporating Heterogeneous Features with NLP using Column-Transformer · Natural Language Processing using scikit-learn’s CountVectorizer or libraries such as spaCy and Gensim can provide powerful insights into… · Jul 26, 2019
Ned H · Water Mapping and Supply in Tanzania · Lambda School has given each of the 3 data science cohorts a Kaggle competition involving water pumps in Tanzania. The data comes from… · May 24, 2019
Ned H · Datetime Objects in Pandas — Strengths and Limitations · In my last post about movie budgets, one of the key feature categories was the Release Dates of films. Dates are an interesting… · Apr 25, 2019
Ned H · Scraping Hollywood — Beautiful Soup and Movie Data · In my last post about Hollywood movies, I didn’t specifically address the engineering challenge of scraping all the data for that analysis… · Apr 21, 2019
Ned H · What if it’s not a NaN? · Dealing with non-NaN strings that represent NaN values. · Apr 11, 2019
Ned H in Weaving Data · Basic Data Cleaning — Removing NaNs · As a beginning data scientist, I’m learning that most of my time is spent preparing data for analysis. Much as writing is about clarifying… · Mar 26, 2019