Having been former kaggle number #1, while being in the industry for many years in senior data…
KazA nova

Thank you for your interesting answer. The point of my article was precisely to invite Kagglers to go an extra mile in other directions, and to enlarge their arsenal.

On Kaggle, we are pushed to go the extra mile in only one dimension: model performance. However, real-world problems are multi-dimensional: they also require a lot of effort to define the problem, to find its value, and to explain this value to potential beneficiaries. On Kaggle, you won’t learn anything about these. I think Kagglers should learn to go the extra mile in those directions too. Look at the big picture. That’s what Startcrowd is trying to achieve.

Otherwise, we get the current situation: a lot of companies solving problems that do not exist. AI is a great tool, but finding useful purposes for this tool is not a trivial task. It requires a lot of training and practice. These companies are now well funded because of the data science hype: they can lure investors and use the money to hire Kagglers. However, when this speculative bubble bursts, the question will arise about the economic value of Kaggle-trained data scientists, especially when there are so many good APIs that can do the same job.

Your list of Kaggle problems is symptomatic: for example, why should I want my computer to differentiate between dogs and cats? Do pet owners suffer from this problem?

Likewise, the Higgs boson was not predicted on Kaggle, as far as I know. It would have been more productive and interesting if the huge Kaggle crowd had built upon this discovery, seen what else could be done, and split the new tasks among a myriad of smaller sub-crowds. Can’t this huge crowd do more than just reinvent the wheel?

Science is cumulative, not competitive. Adding up thousands of contributions is much more difficult than pitting them against each other. That’s where the real challenge is. In the current situation, each additional Kaggler brings only a logarithmic improvement to the collective outcome. It is crowdsourcing performed the wrong way. This fact should be glaringly obvious to any scientist, with or without a PhD.

So no, I won’t come back to Kaggle.

I prefer Startcrowd.