How to win a Kaggle competition

Julia Rubtsova, PhD
Published in Product AI
3 min read · Aug 17, 2021

Kaggle is one of the most popular platforms for data scientists to hone their skills, build a reputation, and possibly get paid. However, becoming successful on Kaggle isn’t easy. It takes patience, hard work, and constant practice. The platform brings together some of the brightest minds in data science, so the competition is fierce. To succeed there, we recommend following these steps:

- Read the competition guide carefully. The description, timeline, evaluation criteria, and rules are all vital. By studying the guide closely, you will also pick up frequently overlooked details such as the required submission format and the guidelines for reproducing reference results. Take your time before getting started.

- Understand the performance metric. The metric is the yardstick by which your submissions will be scored, so you need to know exactly how it works.

- Study the data in detail. Start with an exploratory data analysis to find missing and null values and hidden patterns in the dataset. The more you know about the data, the better the models you can build from it. Look for quirks in the data that you can take advantage of.

- Create your own local environment for testing. A local validation setup lets you move faster and gives you reliable results instead of depending solely on the leaderboard. By reducing the number of submissions, you also significantly reduce the risk of overfitting to the public leaderboard, which saves you from a poor result in the final evaluation phase.

- Read the forums. Forums and online discussions are your allies. They help you keep up to date with what’s going on in the competition. Even if you don’t win, you can keep learning after the competition ends: the solutions shared on the forum show where you went wrong and what your peers did to beat you. It’s a great way to learn from the best and constantly improve.

- Perform exhaustive research. Explore official company blogs, published papers, and patents that may come in handy. Even if you don’t win the first few times, you will learn, hone your skills, and become the best data science expert you can be.

- Use proven solutions. The standard, simple algorithms that you might be tempted to ignore can prove to be a huge advantage. When experimenting with methods, it’s advisable to adjust the basic parameters by hand. Experienced Kagglers say that manual tuning is one of their winning habits.

- Most importantly, build an ensemble of models. In most major competitions, teams combine their models to improve their scores. Because winning Kaggle solutions are very rarely a single model, it makes sense to combine different independent models, even if you are competing alone.

- Work on only one project at a time.

- Choose the right approach. Two winning approaches appear again and again in Kaggle competitions: feature engineering, and neural networks (deep learning). Feature engineering is the best approach if you understand the data. If you’re dealing with speech or image data, deep learning is the way to go.
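
To make the advice on local testing and ensembling more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic, RMSE stands in for whatever metric your competition actually uses, and the helper names (`k_fold_indices`, `fit_mean`, `fit_line`, `cross_validate`) are hypothetical. In practice you would plug in real models and the competition's own metric.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error; a stand-in for the competition's metric."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle row indices once and yield (train, valid) index lists per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


def fit_mean(xs, ys):
    """Toy baseline (hypothetical): always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Toy model (hypothetical): least-squares line through the (x, y) pairs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x if var_x else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def cross_validate(fitters, xs, ys, k=5):
    """Mean validation RMSE per model, plus a simple average ('blend') of all models."""
    fold_scores = {name: [] for name in fitters}
    fold_scores["blend"] = []
    for train, valid in k_fold_indices(len(xs), k):
        x_tr = [xs[i] for i in train]
        y_tr = [ys[i] for i in train]
        x_va = [xs[i] for i in valid]
        y_va = [ys[i] for i in valid]
        preds = {}
        for name, fit in fitters.items():
            model = fit(x_tr, y_tr)  # fit on the training folds only
            preds[name] = [model(x) for x in x_va]
            fold_scores[name].append(rmse(y_va, preds[name]))
        # Ensemble by averaging each row's predictions across models.
        blend = [sum(row) / len(row) for row in zip(*preds.values())]
        fold_scores["blend"].append(rmse(y_va, blend))
    return {name: sum(v) / len(v) for name, v in fold_scores.items()}


# Synthetic data (an assumption): a noisy straight line.
noise = random.Random(42)
xs = [float(x) for x in range(40)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 1.0) for x in xs]

scores = cross_validate({"mean": fit_mean, "line": fit_line}, xs, ys)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.3f}")
```

On this toy data the blend scores between the two models, because the weak baseline drags the average down. That is the kind of thing a local validation loop lets you find out without spending a submission: averaging tends to help when the models are of similar strength and make different errors.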

Believe in yourself and take the time to learn as much as you can. Don’t dismiss any information. For all data scientists looking to master machine learning algorithms, Kaggle is the ultimate platform to build experience and hone skills.

Original article written by Rinat S.

https://medium.com/@rinats
