Here at Kaggle we’re excited to showcase the work of our Grandmasters. This post was written by Vladimir Iglovikov, and is filled with advice that he wishes someone had shared when he was active on Kaggle. The original post can be found on Vlad’s Ternaus Blog.


Introduction

I participated in machine learning (ML) competitions at Kaggle and other platforms to build machine learning muscles. I reached 19th in the global rating and earned the Kaggle Grandmaster title.

Every ML challenge ended with new knowledge, code, and model weights.

I loved new learnings but ignored the value that old ML pipelines could bring. Code stayed in private GitHub repositories. Weights were scattered all over the hard drive. …


The entire dataset of 1.7M+ arXiv papers is now available for free on Kaggle

Photo by Glenn Carstens-Peters on Unsplash

For nearly 30 years, arXiv has served the public and research communities by providing open access to scholarly articles, from the vast branches of physics to the many subdisciplines of computer science to everything in between, including math, statistics, electrical engineering, quantitative biology, and economics.

The sheer number of arXiv research papers is both beneficial and challenging. Whether it’s a graduate student ramping up in her respective field, an established professor delving into adjacent ones, or researchers searching for big picture insights for the public good, this rich corpus of information offers significant, but sometimes overwhelming, depth.

To help make the arXiv more accessible, we present a free, open pipeline on Kaggle to the machine-readable arXiv dataset: a repository of 1.7 million articles, with relevant features such as article titles, authors, categories, abstracts, full text PDFs, and more. …


How one Kaggler took top marks across multiple Covid-related challenges.

Photo by Markus Spiske on Unsplash

Today we interview Daniel, whose notebooks earned him top marks in Kaggle’s CORD-19 challenges. Kaggle hosted multiple challenges that worked with the Kaggle CORD-19 dataset, and Daniel won 1st place three times, including by a huge margin in the TREC-COVID challenge. (He had a score of 0.9, 2nd place overall had a score of 0.75, and 2nd place on Kaggle had a score of 0.6.)

Let’s meet Daniel!

Daniel, tell us a bit about yourself.

Daniel: I’m Daniel Wolffram, a graduate student in mathematics and a data science student assistant at Karlsruhe Institute of Technology (KIT), in Germany. …


Head of Data Analytics at Kaggle, Wendy Kan: This week, I had the pleasure of interviewing Parul Pandey, a Kaggle Notebooks Master, Data Science Evangelist at H2O.ai, and mother to a 5-year-old son. Parul shared the story of how she pivoted her career during maternity leave 5 years ago, and her experience in organizing Women in Data Science and Machine Learning groups.

As a fellow mother and data scientist, I found her story inspiring and wanted to share it with the broader Kaggle community this Mother’s Day.

Parul and her son, Agrim

Wendy: What can you tell us about your academic and professional background? Did you code in school or at your previous job?

Parul: Academically, I’m an electrical engineer. I did quite a bit of coding as part of my curriculum but also enjoyed it as a hobby during my undergraduate years. Professionally, I worked in the power distribution industry as a technical analyst before getting immersed in Data Science. My job entailed less coding and more use of tools, but I would still dabble in small programming projects on the side during my free time. …


Kaggler deoxy takes first place and sets the stage for his next competition.

Please join us in congratulating Linsho Kaku (aka deoxy) on his solo first-place win in our Bengali.AI Handwritten Grapheme Classification challenge! Read the winning solution here: 1st Place Solution with Code

Random Ink by Sankarshan Mukhopadhyay @Flickr

Let’s meet Linsho!

Linsho, what would you like to share about yourself?

Linsho: I am a student in the Rio Yokota Laboratory at the Tokyo Institute of Technology. The main theme of the lab is high performance computing with advanced architectures including GPUs. We also deal with deep learning as one of its applications. I’m also an intern at Future Inc. working on an OCR task.

Did you have any prior experience or domain knowledge that helped you succeed in this competition?

The experience of working on OCR tasks as an intern was a big advantage for me. The ease with which I was able to pre-process data and create models was thanks to my internship experience. I’ve never specialized in Few-Shot Learning, which has been a major factor in the scores of the top teams this time around. However, I think my knowledge of the paper, which was shared in the lab, made a big difference. …


Join us in congratulating Sanghoon Kim (aka Limerobot) on his third-place finish in Booz Allen Hamilton’s 2019 Data Science Bowl.


The Data Science Bowl, presented by Booz Allen Hamilton and Kaggle, is the world’s largest data science competition focused on social good. Each year, this competition gives data scientists a chance to use their passion to change the world. Over the last four years, more than 50,000 competitors have submitted over 114,000 submissions to improve everything from lung cancer and heart disease detection to ocean health. This year, competitors were challenged to identify the factors that matter most to predicting player capability in an educational kids’ game by PBS. …


First-place foursome ‘Bibimorph’ share their winning approach to the QUEST Q&A Labeling competition by Google, and more!

Photo by Anna Vander Stel on Unsplash

Congratulations to the (four!) first-place winners of the QUEST Q&A Labeling competition, Dmitriy Danevskiy, Yury Kashnitsky, Oleg Yaroshevskiy, and Dmitry Abulkhanov, who make up the team “Bibimorph”!

In the QUEST Q&A Labeling competition by Google, participants were challenged to build predictive algorithms for different subjective aspects of question-answering. The provided dataset contained several thousand question-answer pairs, mostly from StackExchange. These pairs were human-labeled to reflect whether the question was well-written, whether the answer was relevant, helpful, satisfactory, contained clear instructions, etc. Results from the competition will hopefully foster the development of Q&A systems, contributing to them becoming more human-like. …


Congratulations to the winningest duo of the 2019 Data Science Bowl, ‘Zr’ and Ouyang Xuan (Shawn), who took first place and split $100,000!


The Data Science Bowl, presented by Booz Allen Hamilton and Kaggle, is the world’s largest data science competition focused on social good. Each year, this competition gives data scientists a chance to use their passion to change the world. Over the last four years, more than 50,000 competitors have submitted over 114,000 submissions to improve everything from lung cancer and heart disease detection to ocean health. This year, competitors were challenged to identify the factors that matter most to predicting player capability in an educational kids’ game by PBS. …


In our first winner’s interview of 2020, we’d like to congratulate The Zoo on their first-place win in the NFL Big Data Bowl competition! Please enjoy this joint Q&A between top competitors and teammates Dmitry and Philipp.

Have follow-up questions for them? Leave a comment below.

Let’s start with an introduction. Who is ‘The Zoo’?

We are both The Zoo! Dmitry Gordeev (dott)👋 and Philipp Singer (Psi)👋

We started competing together on Kaggle about a year ago. Back then, we had just started working at the same company in Austria as data scientists. We got to know each other outside of work and realized we had the same desire to get more practice with real-world machine learning tasks. We started doing a Friday afternoon hackathon, and one day decided to try a Kaggle competition just to see how it goes. …

About

Kaggle Team

Official authors of Kaggle winner’s interviews + more! Kaggle is the world’s largest community of data scientists. Join us at kaggle.com.
