Dimitri Linde in The Startup | Parsing Nonlinear Relationships and Deriving Features With Titanic Data | Background | May 19, 2020
Dimitri Linde | Predicting Instacart Customer Purchases and Analyzing The Output | This post is the third of three in my series about Instacart’s recently released “Online Grocery Shopping Dataset 2017.” In the first post… | Aug 9, 2017
Dimitri Linde | Generating Features to Predict Instacart Customer Purchases | In my last post, I explored Instacart’s recently released dataset of 3.2 million customer orders. Instacart was in part motivated to… | Aug 1, 2017
Dimitri Linde | Exploring “The Instacart Online Grocery Shopping Dataset 2017” | In early May, the grocery delivery service Instacart released a dataset containing between 4 and 100 orders across 206,000 users — 3.2… | Jul 15, 2017
Dimitri Linde | Scraping and Classifying Indeed Job Postings for Data Occupations, Part 2 | In the companion piece to this post, I discussed how to scrape the job aggregation site Indeed to compile a DataFrame of job descriptions… | Jul 6, 2017
Dimitri Linde | Scraping and Classifying Indeed Job Postings for Data Occupations, Part 1 | Compiling a dataset of job postings, structuring the text, and then classifying the postings by income-level and occupation-type requires… | Jul 5, 2017
Dimitri Linde | Principal Component Analysis with Lasso Regression on Kaggle’s Ames Housing Dataset | Kaggle’s Ames Housing dataset is modestly sized; there are 1460 rows and 81 columns in the training set, which I first encountered to… | Jun 6, 2017
Dimitri Linde | Amateur Investigations of The Kaggle Chicago West Nile Virus Dataset | West Nile Virus, far from its discovery in Uganda, has recently become a recurring problem for the unfortunate people of Chicago. The… | May 19, 2017
Dimitri Linde | A Tribute to rob culliton | Part Test, Part Tribute. 70/30. Mostly a test, if we’re being honest. | May 1, 2017