In-The-Money Kaggle Gold — What It Brought Me?

Just a realization; nothing towards career

James Koh, PhD
MITB For All
5 min read · Oct 12, 2023

A few years ago, I won a Kaggle gold medal in a competition, predicting the readability score of texts.

In fact, at 6th position out of over 3,633, it was in the money, with a cash prize of $5,000 shared among the team.

Screenshots from my Kaggle account

I never bothered to put it on my LinkedIn or resume, because I don’t see it as useful for career advancement. Let me share 3 reasons why, before I go on to talk about my realization.

Reason #1 — It’s really about doing an ensemble of the top models

Face it. The vast majority of Kaggle competitors take other people’s published notebooks, and then build ensembles on top of them.
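To make "building an ensemble" concrete, here is a minimal sketch of the simplest form of it: a weighted average of predictions from two models. The numbers are made up for illustration and are not from the competition; `model_a` and `model_b` are hypothetical stand-ins for the outputs of two public notebooks, and RMSE is used only as an assumed example metric.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def blend(predictions, weights):
    """Weighted average of several models' predictions, sample by sample."""
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Toy targets and predictions (illustrative values only).
y_true = [0.0, 1.0, 2.0, 3.0]
model_a = [0.2, 0.8, 2.2, 2.8]    # hypothetical output of one notebook
model_b = [-0.1, 1.3, 1.9, 3.1]   # hypothetical output of another notebook

blended = blend([model_a, model_b], weights=[1.0, 1.0])

print("model A :", round(rmse(y_true, model_a), 4))
print("model B :", round(rmse(y_true, model_b), 4))
print("blended :", round(rmse(y_true, blended), 4))
```

In this toy case the two models err in different directions, so the average scores better than either one alone. Real ensembles are usually more elaborate (stacking, tuned weights), but the idea is the same.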

Most state-of-the-art models are already good enough, and it is highly unlikely that there exists a magical unknown model that only you know of. (If so, you can publish an academic paper, and that journal/conference would probably bring you lots of citations!)

This is evident from the fact that all competitors on the leaderboard are very close to one another. You can see that the difference between the grand champion and recipients of the silver medal is a mere 1%.

17th position is the last gold and 181st position is the last silver, yet the difference between them is less than 2% in the metric. And the gap between 1st position and 254th position (a bronze medal) is a mere 3%.

Leaderboard (from 1st to 254th position). Screenshot by author.

Reason #2 — It doesn’t go beyond model building

There’s no formulation of the business problem; you just use what’s been given to everyone and try to get that extra 0.001 on whatever metric you are scored on.

Apart from the limit on execution time, you do not learn to deal with the constraints that you would face in the real world: meeting timelines, dealing with changes to the user’s priorities, managing your boss’s expectations, deploying the model, and presenting/‘selling’ your work to people who do not know the details behind it.

These skills, which are vital to career success, are not developed in a Kaggle competition.

Reason #3 — The objectives are very different in ‘real life’

Spending that much time chasing an extra 0.001 is not how things work in (the vast majority of) business environments. No team would spend a full month just to push the metric up a notch.

My take on what’s out there. Image by author.

If you take a step back to think about it, countless thousands of man-hours are burnt as competitors fight to stay on the leaderboard. It is a zero-sum game, in that one’s rise must come at the expense of another’s fall.

My Realization

That was my last Kaggle competition. I figured, if “that’s it” with a gold medal, then it’s time to pull the plug.

I came to the conclusion that my time could be much better spent elsewhere, doing things that actually bring progress to my career.

You could be thinking, ‘these words are easy to say for someone who already has a foot in the AI door’. Well, refer to points #2 and #3 above again, and ask yourself if the hiring manager cares about your gold medal.

Of course, you could say that you would join a competition just to try out the models out there, and then stop. However, due to my competitive nature, I tend to get caught up in the leaderboard, and I have come to realize that it is an ineffective use of my time.

If you are:

  • (like me back then) doing it to boost your resume and demonstrate your competence, do think about the points I raised above.
  • seeking to learn new things, think of how much time is really worth investing, and don’t get lost in the process.
  • joining Kaggle competitions for the chance to win some cash, you are probably better off working part-time as a waiter, cashier or delivery person. (No offence to these jobs; they are an honest living.)

Clarifications

Now, don’t get me wrong. No one is undermining people who are into Kaggle competitions, much less those who have won hard-earned medals. It is just that there is so much more when you look at the entire Data Science universe and put it in context.

I’ve learnt some useful things in the process. I’m sure you will too. Through this kind of competition, you will learn to appreciate the fact that CV and NLP work on the basis of good feature extractors (and in the process learn which SOTA models to use). You will learn to use intermediate feature vectors, combine them, and apply a whole variety of supervised learning techniques (xgboost, random forest, and more). But if you were to invest hundreds of hours, what do you think the marginal returns would be?
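For readers who have not tried it, "combining intermediate feature vectors" can be sketched roughly as follows. This is an illustration only, not my competition code: the features and targets are made up, `combine` and `knn_predict` are hypothetical helpers, and a tiny nearest-neighbour regressor stands in for xgboost or a random forest so that the example runs with the standard library alone.

```python
import math


def combine(*feature_sets):
    """Concatenate per-sample feature vectors coming from several extractors."""
    return [sum((list(vec) for vec in vecs), []) for vecs in zip(*feature_sets)]


def knn_predict(train_x, train_y, query, k=2):
    """Average the targets of the k training rows closest to the query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Made-up intermediate features from two hypothetical extractors.
feats_a = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
feats_b = [[1.0], [0.9], [0.1], [0.2]]
targets = [1.0, 1.2, 3.0, 2.8]

# Each training row now holds 2 + 1 = 3 features.
train_x = combine(feats_a, feats_b)

# The query must be built the same way: extractor A's features, then B's.
prediction = knn_predict(train_x, targets, query=[0.15, 0.85, 0.95], k=2)
print(prediction)
```

In practice the concatenated vectors would be much longer (for example, embeddings from several pretrained models) and the downstream learner would be something stronger, but the plumbing looks like this.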

My point is, there comes a time when we need to put it down and move on, because there’s way too much to learn out there. Whether this happens sooner or later is up to you.

Disclaimer: All opinions and interpretations are that of the writer, and not of MITB. I declare that I have full rights to use the contents published here, and nothing is plagiarized. I declare that this article is written by me and not with any generative AI tool such as ChatGPT. I declare that no data privacy policy is breached, and that any data associated with the contents here are obtained legitimately to the best of my knowledge. I agree not to make any changes without first seeking the editors’ approval. Any violations may lead to this article being retracted from the publication.

James Koh, PhD
MITB For All

Data Science Instructor - teaching Masters students