Case Study: How Your Course Can Incorporate the Reproducibility Challenge

Published in PapersWithCode, Apr 29, 2021

The Machine Learning Reproducibility Challenge (MLRC) is an event hosted by Papers with Code designed to encourage the publishing and sharing of reproducible scientific results in machine learning (ML). It is therefore an excellent opportunity to expose students to ML research, and accepted papers are published in a special edition of the ReScience journal.

The University of Amsterdam incorporated the MLRC into a graduate-level course for students in the Master AI study programme. The project component of the course was based on MLRC participation, and this was the only deliverable for the course.

Below we cover what we learned from incorporating the MLRC into a graduate-level course. Specifically, we outline our motivation, how we set up and ran the course, and guidelines for other programmes that may want to do the same.

Motivation

The Fairness, Accountability, Confidentiality and Transparency in Artificial Intelligence course is part of a two-year, technical, research-focused Master's programme. Given that reproducing existing research is often one of the first steps in ML research, participating in the MLRC gives students the chance to experience the entire research pipeline:

Reading a technical paper
Understanding a paper’s strengths and weaknesses
Implementing (and maybe extending) the paper
Writing up the findings
Submitting to a venue with a deadline
Obtaining feedback
Writing a rebuttal
Receiving the official notification

Preparing your course

Before your MLRC-supported course starts, we recommend:

1. Choose 10-15 papers from the MLRC OpenReview portal that are suitable for your course. Key requirements:

At least one dataset in the paper is publicly available.
Experiments can be run on a single GPU (which we were able to provide access to).
Paper is relevant to the topics covered in the course.
It is reasonable to reimplement the paper within the allotted time.

2. Hire a team of experienced, graduate-level TAs to guide students through the MLRC project. Have multiple groups work on the same paper (to reduce the teaching load for TAs), and have each TA supervise 2–3 papers.

3. Assign papers to TAs based on their interests (i.e., ask them to rank the set of candidate papers in advance). Allow them to suggest alternative papers (provided they fit the requirements in Step 1). Have TAs read their papers in-depth in advance of the course and familiarize themselves with the corresponding public code base, if available.

4. Write a document outlining the course and the MLRC project.

5. Provide the grading scheme in advance.
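
The key requirements from Step 1 can be treated as a simple checklist when screening candidate papers. The sketch below is only an illustration: the `CandidatePaper` class, its field names, and the example entries are hypothetical and not part of the original course material.

```python
from dataclasses import dataclass


@dataclass
class CandidatePaper:
    # All fields are hypothetical names mirroring the four requirements in Step 1.
    title: str
    has_public_dataset: bool   # at least one dataset is publicly available
    fits_single_gpu: bool      # experiments can be run on a single GPU
    relevant_to_course: bool   # paper is relevant to the course topics
    feasible_in_time: bool     # reimplementation is reasonable in the allotted time


def is_suitable(paper: CandidatePaper) -> bool:
    """Return True only if the paper meets every requirement."""
    return all([
        paper.has_public_dataset,
        paper.fits_single_gpu,
        paper.relevant_to_course,
        paper.feasible_in_time,
    ])


# Made-up entries for illustration only.
candidates = [
    CandidatePaper("Hypothetical paper A", True, True, True, True),
    CandidatePaper("Hypothetical paper B", True, False, True, True),
]

shortlist = [p.title for p in candidates if is_suitable(p)]
print(shortlist)
```

In practice, judging feasibility and relevance is a manual decision by the instructors and TAs; the checklist only records the outcome.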

Running your course

During the course we recommend:

Ask students to form groups of 3–4 and indicate paper preferences through a Google Form. Each group works on one paper. We did our best to assign papers to groups based on these preferences, as we believe that implementing a paper students are genuinely interested in leads to better results. Each group was allocated one GPU for the duration of the course to run experiments on.
Include paper reading sessions as part of the course, where an instructor first walks students through a seminal ML paper, and then students discuss the paper together in smaller groups. Questions we encouraged students to think about include:

What are the main claims of the paper?
What are the research questions?
What are the answers to the research questions?
Does the experimental setup make sense, given the research questions?
Do the experimental results support the main claims of the paper and answer the research questions effectively?
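
Assigning papers to groups from their ranked preferences, while letting several groups share a paper, can be sketched as a simple greedy pass. This is not necessarily how the course did it (the post only says the instructors did their best to respect preferences); the function name `assign_papers`, the cap `max_groups_per_paper`, and the example data are all assumptions made for illustration.

```python
from collections import Counter


def assign_papers(preferences, max_groups_per_paper):
    """Greedy sketch: each group gets its highest-ranked paper that still has room.

    preferences maps a group name to its papers, ordered from most to least preferred.
    Groups are processed in the order they appear, so earlier groups get priority.
    """
    load = Counter()   # number of groups already assigned to each paper
    assignment = {}
    for group, ranked_papers in preferences.items():
        for paper in ranked_papers:
            if load[paper] < max_groups_per_paper:
                assignment[group] = paper
                load[paper] += 1
                break
        else:
            # None of the group's listed papers has room; resolve this by hand.
            assignment[group] = None
    return assignment


# Made-up preferences for illustration only.
preferences = {
    "group_1": ["paper_A", "paper_B", "paper_C"],
    "group_2": ["paper_A", "paper_C", "paper_B"],
    "group_3": ["paper_A", "paper_B", "paper_C"],
}

assignment = assign_papers(preferences, max_groups_per_paper=2)
print(assignment)
# {'group_1': 'paper_A', 'group_2': 'paper_A', 'group_3': 'paper_B'}
```

Because the greedy pass favours groups that come first, shuffling the group order before assigning (or using submission time) is one way to keep the process fair.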

Making your course successful

Our experience with the course highlighted a few principles for ensuring a successful outcome:

Ensure students don’t get stuck by having experienced TAs guide them throughout the entire project. We also had a dedicated Slack workspace for the TAs and course instructors to keep in touch regularly.
Prioritize the MLRC by tying the reproducibility report directly to the grading. Students were graded on the same report that they submitted to the MLRC, so participating in the MLRC was not an extra task but an integral part of the course.
Require students to extend the paper if source code is already available. We found this resulted in creative and interesting ideas in the reports.
Require students to submit a draft report to TAs two weeks before the deadline. We found having this feedback significantly increased the quality of the final reports.
Grade reports independently of the reviews. This keeps the publication process as an additional learning experience, and reduces dependencies between course timelines and reviewing periods.
Reward students for formally submitting to the MLRC with bonus points on their final grades. We increased grades by 5%.

Beyond the course

After submission of the reports, we recommend:

Have TAs help students write rebuttals since this is a new experience for them.
Collect and process feedback from students to improve the next iteration of the course.
Encourage students to pursue research further if they enjoyed their projects!

Last year, two students approached the course instructors about extending their project. That extension has since turned into a paper under review at a top conference.

These students became TAs this year, and they gave a talk at the beginning of the course about their experiences from the previous year. This was very insightful and motivating for our new class of students.

All in all, this was a great experience both for students and TAs, with 9 papers accepted at the MLRC. We are planning to continue incorporating the MLRC in future iterations of the course.

This is a guest post by Ana Lučić, University of Amsterdam

Acknowledgements: This course was coordinated together with Maarten de Rijke, Maurits Bleeker, and Sami Jullien. We want to thank Michael Neely, Stefan Schouten, Angelo de Groot, Christos Athanasiadis, Christina Winkler, Micha de Groot, and Maartje ter Hoeve for their help with the course. We also want to thank Ross Taylor for helping with the blog post.