Announcing the 2024 TMLR Outstanding Certification
By the 2024 TMLR Outstanding Paper Committee: Michael Bowling, Brian Kingsbury, Andreas Kirsch, Yingzhen Li, and Eleni Triantafillou
The 2024 TMLR Outstanding Paper Committee is pleased to award Outstanding Certifications to two papers this year:
- “Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration”
- “Holistic Evaluation of Language Models” (HELM)
This marks the first time the certification has been awarded to multiple papers, reflecting the exceptional quality and impact of both works.
“Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration,” by Mauricio Delbracio and Peyman Milanfar, introduces an elegant method that effectively addresses the “regression to the mean” problem in supervised image restoration. Unlike existing denoising diffusion models, the proposed method does not require an analytic form of the degradation process, making it applicable to any restoration task with paired examples. The work establishes deep theoretical connections with residual flow ODEs, denoising diffusion, and flow matching, while maintaining a remarkably simple formulation. It has already inspired numerous follow-on papers and received strong endorsements from both expert reviewers and Action Editors for its theoretical foundations and empirical validation. Experts particularly noted the paper’s comprehensive investigation of the proposed method and its clean theoretical connections to existing approaches.
“Holistic Evaluation of Language Models” (HELM), by a group of 50 authors including lead authors Percy Liang, Rishi Bommasani, and Tony Lee, establishes a comprehensive framework for evaluating language models that has become foundational. The paper stands out for its careful methodology in isolating conclusions and its thorough evaluation design across multiple dimensions of model performance. Expert reviewers highlighted that HELM was one of the first evaluation platforms of the large language model era, setting important standards through its attention to detail and its thoughtful choice of evaluation metrics. The work’s impact extends well beyond its immediate contributions, and the project has grown far beyond the scope of the original paper: the HELM team maintains an active project page featuring open-source releases of several leaderboards, code, and evaluation artifacts, enabling a wide range of future research. Notably, the authors thoughtfully acknowledged the platform’s limitations, which has helped guide subsequent research in addressing these gaps.
We note that, while institutional affiliation was not a selection criterion, the two winning papers represent work from both academic and industrial research teams. This reflects TMLR’s role as a venue that attracts excellent submissions from researchers across the broader machine learning community.
The Committee also recognizes three papers as Outstanding Paper Finalists:
- “Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models” (BIG-bench)
- “DINOv2: Learning Robust Visual Features without Supervision”
- “Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback”
These papers represent significant contributions across different areas of machine learning. BIG-bench was noted for its remarkable cross-institutional collaboration and novel task designs that highlight important areas for future model improvement. DINOv2 demonstrated substantial improvements in fine-grained visual understanding tasks and has become one of the most commonly used backbones in computer vision. The RLHF survey paper provides a comprehensive analysis of an increasingly important technology while highlighting crucial open problems and limitations. All of these works have been retroactively recognized with a Featured Certification.
Selection Process
The selection process for this year’s awards followed criteria similar to last year’s inaugural selection, while adding rigor to the evaluation process. All papers published in TMLR up to May 3, 2024 were eligible, excluding those already considered for last year’s award. The initial pool of candidates was filtered on two primary criteria: papers that received Featured Certifications from their Action Editors, and papers that demonstrated significant citation impact (identified using bibliometric analysis).
Featured Certifications continue to serve as one of the most reliable indicators of exceptional work in TMLR, as these papers have been specifically highlighted by their Action Editors for their outstanding contributions. Citation metrics, while considered, were carefully weighted to account for various factors including publication timing and subfield-specific citation patterns.
For each candidate paper, the committee solicited detailed feedback from both the paper’s Action Editor and multiple domain experts. This expert feedback was crucial in evaluating several dimensions:
- Technical depth and novelty
- Broader impact on the field
- Reproducibility and open-source contributions
- Methodological rigor
- Potential for long-term influence in its field
The committee conducted multiple rounds of review and discussion, with careful attention to managing potential conflicts of interest. Committee members with institutional affiliations to any candidate papers recused themselves from voting on those papers. Even with these recusals, the winning papers received strong support from the non-conflicted committee members.
The papers that advanced to final consideration demonstrated not only technical excellence but also significant impact on their respective fields. The committee noted that several other strong candidates were considered, and the final selection represents papers that have already begun to shape how the community approaches important problems in machine learning.
We encourage the machine learning community to read both the winning papers and the finalist papers, as they represent exemplary work published in TMLR and demonstrate the journal’s commitment to publishing high-quality, impactful research across different areas of machine learning.
Acknowledgments
We would like to thank Gautam Kamath and Naila Murray for logistically overseeing the selection process, and our team of expert reviewers and Action Editors, including (in alphabetical order): Naman Agarwal, Anurag Arnab, Richard Baraniuk, Yonatan Bisk, Valentin De Bortoli, Mathilde Caron, João Carreira, Antoni Chan, Swarat Chaudhuri, Changyou Chen, Pin-Yu Chen, Sinho Chewi, Leshem Choshen, Marco Cuturi, Mostafa Dehghani, Yuntian Deng, Carl Doersch, Vincent Dumoulin, Greg Durrett, Dumitru Erhan, Aleksandra Faust, Vincent Fortuin, Scott Geng, Shixiang Shane Gu, Jia-Bin Huang, Phillip Isola, Bahjat Kawar, Abhishek Kumar, Stefan Lee, Fuxin Li, Lihong Li, Yujia Li, Anatole von Lilienfeld, Marlos C. Machado, Stephan M. Mandt, Lili Mou, Mirco Mutti, Karthik Narasimhan, Gang Niu, Ivan Oseledets, Gabriel Peyré, Colin Raffel, Marcello Restelli, Francisco Ruiz, Jonathan Scarlett, Ludwig Schmidt, Evan Shelhamer, Freda Shi, Matthew E. Taylor, Kevin Xu, Qiang Xu, Makoto Yamada, Jong Chul Ye, Hanwang Zhang.