Thank you for your interesting answer.
Mostapha Benhenda

I think a lot of people on Kaggle have followed the pipeline you suggested to some extent, with variations. For example, Andrew’s course is almost a prerequisite for participating in Kaggle, and many Kagglers (including me) had taken it before competing. Kaggle is kind of a ‘next step’.

Well, I would disagree that Kaggle only focuses on model performance. This may have been the case in the past, but now ability is also measured by participation in discussions and by building ‘kernels’ and/or descriptive notebooks that share insights about the data. Their ranking system is based on 3 tiers (model performance, contributions in discussions, contributions via scripts). But to your point about focusing on model performance: you obviously cannot have everything on a platform that has 1M members. You need to find ways to standardize, measure, and improve skills in a quantifiable way.

I have not tried Startcrowd before, but I will give it a shot. I have tried many other predictive modelling platforms, and I think Kaggle has so far been the best, for some good reasons. Most of the other platforms that try to incorporate more qualitative elements into their competition/solving process (which could extend to how we create the problem or even generate the data, and not just modelling on some finite features and a target variable) suffer from a lack of objectivity. Somebody has to assess whether your approach is valid based on subjective criteria (as well as objective ones).

I think the hype or data science bubble you mention is NOT a Kaggle problem but a general one. Companies do not know how to use data science right now, and they think they should use it even though they don’t know what it is, but this does not mean that you don’t get valuable lessons from Kaggle. We would have to dig into what the definition of a data scientist is, but knowing how to use unsupervised and supervised methods to map features to an outcome, visualizing the data, and having access to a community of data scientists from within the field (some more experienced than others) all contribute massively to becoming a better data scientist. Knowing how to pose the right question to give value to the business is something you abstractly learn from Kaggle too, as many different companies try to do just that by uploading their data to Kaggle.

“Why should I want my computer to differentiate between dogs and cats? Do pet owners suffer from this problem?”

Many innovations take place on Kaggle. Researchers have been trying for years to tackle this problem! Within Kaggle (and for that specific problem), AI error dropped below human error for differentiating the two. Having the skills to differentiate between the two, as a data scientist, can help you with many other image classification tasks. Not all Kaggle problems focus on solving specific business tasks; some were funded by universities.

You call the Higgs boson competition hosted on Kaggle “re-inventing the wheel”; however, the top models were invited to conferences to present their findings (https://higgsml.lal.in2p3.fr/prizes-and-award/nips/). It was also the first public appearance of XGBoost (https://github.com/dmlc/xgboost), one of the best gradient-boosting algorithms out there, which was eventually invited to CERN in 2015 as part of this (https://higgsml.lal.in2p3.fr/prizes-and-award/award/) to be presented. Today it is considered state of the art, and this competition was used to improve its performance. This is just one small example; the top solutions in that competition were all very innovative.

I agree that science is cumulative. Competitions help you push the boundaries and see the theoretical best you could get for a specific, finite problem, given some input data and the current state-of-the-art techniques and approaches available out there. They also add a fun element (as does collaboration).

I have learnt massively from Kaggle, and it has helped my career a lot. I will definitely come back to Kaggle, and I invite everybody else to do so! It would have been nice to see you there too. The more the merrier!
