Topping the Kaggle leaderboard while commuting home

Orion Talmi
@FireflyAI
3 min read · May 13, 2018

At Firefly, I lead the algorithm development for our AutoML platform. I think we are taking an intelligent approach, but I'm biased. So, from time to time, we have our software compete in real challenges posed to the machine learning community.

I tried our AutoML platform on Kaggle's Santander Customer Satisfaction challenge, in which Santander Bank asked Kaggle competitors to predict dissatisfied customers. It was one of the most popular competitions of the last few years, with a leaderboard of 5,123 teams.

We wanted our platform to prove its value, not to prove that we are great data scientists. That dictated minimal, if any, manual work. The pile of my regular work waiting on my desk only strengthened this resolve.

So for this competition, I decided to prepare the data for the system before going home, let the system sweat through the night, and submit the predictions in the morning. My two-hour commute home only added to my motivation to do it as quickly as possible.

So I downloaded the training data, 76,000 samples with 370 features, all anonymized, and fed it into Firefly Lab, the component of our platform responsible for finding and training the best possible model.
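If you want to follow along outside the platform, loading the competition data takes a couple of lines of pandas. A minimal sketch, assuming the standard train.csv from the Kaggle data page (the TARGET column marks dissatisfied customers):

```python
import pandas as pd

# Load the competition's training set (train.csv from Kaggle).
train = pd.read_csv("train.csv")

# Split the anonymized features from the label.
# TARGET is 1 for a dissatisfied customer, 0 otherwise.
X = train.drop(columns=["ID", "TARGET"])
y = train["TARGET"]

print(X.shape)  # roughly 76,000 rows of anonymized numeric features
```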

That took me about five minutes. The system spent another five minutes automatically analyzing the dataset. Reviewing the analysis results myself would have taken an entire work week, since 370 features are far too many to inspect by hand, so I put my faith in the system.

I then spent the last five minutes of my work day setting up the parameters for the model search. I could have used the default values but, after all, it was a competition. So I spent a couple of minutes tinkering with the UI's advanced settings to make sure the software would have free rein: I increased the time budget to overnight (12 hours), allowed up to 500 models to be trained and capped the ensemble at 100 models.
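Those settings live in Firefly Lab's UI rather than in code, so the snippet below is purely hypothetical: the dictionary keys are invented names for the three knobs just described, shown only to make the search budget concrete.

```python
# Hypothetical rendering of the advanced settings described above.
# Firefly Lab exposes these through its UI; the key names are invented.
search_config = {
    "time_budget_hours": 12,    # run overnight
    "max_models_trained": 500,  # upper bound on candidate models
    "max_ensemble_size": 100,   # cap on models combined at the end
}
```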

Then I clicked the ‘Run’ button and left it to Firefly Lab to automatically search for the best parameters and predictors. I, on the other hand, had a relaxing walk to the train station.

The next morning, I noticed that my advanced settings had been overkill. I could have stopped the run after half an hour and gotten almost the same results. Instead, with no one around to stop it, it kept going and slowly improved the results.

Altogether, 312 models were trained. From these, an ensemble was constructed to further increase predictive power. In this case the golden ensemble was composed of four Random Forest models and a single Gradient Boosting model.
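Firefly Lab assembles these ensembles internally, but the general shape is easy to reproduce. Here is a minimal sketch in scikit-learn of a soft-voting ensemble with the same composition, reusing X and y from the loading snippet above; the hyperparameters are placeholders, not the ones the system actually found:

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)

# Four Random Forests (different seeds) plus one Gradient Boosting model,
# combined by averaging predicted probabilities ("soft" voting).
members = [
    (f"rf_{i}", RandomForestClassifier(n_estimators=300, random_state=i))
    for i in range(4)
]
members.append(("gb", GradientBoostingClassifier(random_state=0)))

ensemble = VotingClassifier(estimators=members, voting="soft")
ensemble.fit(X, y)

# Predicted probability of dissatisfaction, the quantity Kaggle scores.
proba = ensemble.predict_proba(X)[:, 1]
```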

Looking more closely, I could see the details of the models that Firefly Lab had chosen: whether automatic data cleaning was used, which imputation method worked best, which engineered features were added and whether feature selection was necessary.
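The platform reports these choices rather than exposing them as code, but each step maps onto a familiar scikit-learn construct. A sketch with illustrative choices (median imputation, keeping the 100 most informative features), not the ones the Lab settled on:

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline

# Imputation followed by univariate feature selection, the same kinds
# of preprocessing steps the Lab reports. Both settings are illustrative.
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("select", SelectKBest(score_func=f_classif, k=100)),
])

X_prepared = preprocess.fit_transform(X, y)
```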

I then spent five minutes on the Kaggle submission: uploading the test data, downloading the predictions, creating the submission file in the correct format (sketched below) and uploading it to the Kaggle website.
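The submission format itself is simple: one row per test customer, with an ID column and a TARGET column holding the predicted probability of dissatisfaction. A sketch using the illustrative ensemble from above (in reality the predictions came from Firefly Lab):

```python
import pandas as pd

# Load the competition's test set and score it with the illustrative
# ensemble from the earlier sketch (Firefly Lab produced the real ones).
test = pd.read_csv("test.csv")
predictions = ensemble.predict_proba(test.drop(columns=["ID"]))[:, 1]

# One row per customer: ID plus the predicted probability of dissatisfaction.
pd.DataFrame({"ID": test["ID"], "TARGET": predictions}).to_csv(
    "submission.csv", index=False
)
```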

The results surprised me, and I realized I was two years too late.

The AUC score of 83.418% exceeded that of the Kaggle winner… by the same margin by which the winner outdistanced the 2,734th entry.
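For context, AUC (area under the ROC curve) was the competition's metric; it rewards ranking dissatisfied customers above satisfied ones, and computing it takes one scikit-learn call:

```python
from sklearn.metrics import roc_auc_score

# AUC measures how well predicted probabilities rank dissatisfied
# customers above satisfied ones. Scoring the training data like this
# is optimistic; Kaggle scores a held-out test set.
auc = roc_auc_score(y, proba)
print(f"AUC: {auc:.5f}")
```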

It took me 20 minutes altogether. On the first try.
