What the hell am I supposed to do when public scripts are beating the model I spent <insert time> hours/days on?

Laurae: This post is about why you should not get frustrated when people using point-and-click scripts beat the models you worked hard on. The original post can be found on Kaggle. It includes Selfish Gene’s post. The paragraph before that quoted post was included on Medium.

Advice if you are there mainly for the competition: set the best public script as your benchmark and beat it :) A public script scoring better than you does not mean you can’t do better: some people rank higher than these scripts, and if they did it, you can do it too with your own techniques!

If you have already beaten it, nothing prevents you from trying original combinations to reach the top.

Scripts are valuable for people who learn from them properly. Those who are baited by high-ranked scripts without understanding how they work will get ripped apart at an interview when asked “How did you rank this high on Kaggle? Explain to me how ABC works. How did you engineer XYZ? What was your process?”, because they won’t be able to explain how they ranked this high (the script content and the whole process of building it from zero) :)

There is a very large thread about scripts in the CrowdFlower competition.

Don’t get me wrong, but there are also people who submit scripts just to “reserve” their place before the first deadline (it is faster than submitting sample_submission.csv, and you can’t participate if you miss the first deadline). Scripts are also very good for testing original ideas.

Also: imagine you tried something that looked super cool and only reached #3000… when you could read a script, learn from it, and add your own stuff on top of it to score even higher! And if it fails… be brave, scrap everything (keep it somewhere else), and restart from the script to add new things!

Quoting Selfish Gene:

Selfish Gene wrote:
Someone figures something out, shares his result with others and everybody wins!

Here is Selfish Gene’s full post if you are interested. It is really nice to read, as it sums up what a public script is supposed to do (yes, PUBLIC: you can copy and paste it, then improve on it more than your competitors if you intend to, until you overfit badly and get crushed by 3000+ people, even those who went “AFK” for months and did not come back for the end of the competition):

Selfish Gene wrote:
I don’t really understand this talk about “plagiarism” and “ethics” and the general negativity in reference to scripts.
in what way is running someone else’s script different from running XGB? did we write XGB? or how is it different from us using boosting? have we invented the concept of boosting? or how is it different from using numpy or sklearn? … or using python? … or using Windows or Linux? or using a computer altogether? or using electricity? or being located in a building with a roof and plumbing? or buying food at the grocery store?
Not only we are not growing our own food, we have even forgotten a little bit that someone sometime had to invent the concept of agriculture and that they struggled quite a bit before they could make it work.
And this is not a bad thing! This is how technological and scientific advancement works!
Someone figures something out, shares his result with others and everybody wins!