| NeurIPS Reproducibility Challenge | |
|---|---|
| Name | NeurIPS Reproducibility Challenge |
| Established | 2018 |
| Discipline | Machine learning |
| Venue | NeurIPS |
# NeurIPS Reproducibility Challenge
The NeurIPS Reproducibility Challenge is an initiative associated with the Conference on Neural Information Processing Systems that organizes community-led efforts to reproduce results from published machine learning papers. It coordinates volunteers, student participants, and senior researchers to verify experimental claims and archive replication artifacts, drawing contributors from academia, industry, and affiliated conferences.
The Challenge mobilizes contributors from institutions such as Stanford University, Massachusetts Institute of Technology, University of California, Berkeley, Carnegie Mellon University, University of Oxford, University of Cambridge, Harvard University, Princeton University, California Institute of Technology, ETH Zurich and Tsinghua University, alongside industry labs such as Google Research, Microsoft Research, Facebook AI Research, DeepMind, OpenAI, Amazon Web Services, IBM Research, Intel Labs, NVIDIA, Huawei, Baidu Research and Tencent AI Lab. It works with venues and organizations including NeurIPS, ICML, ICLR, ACL, the AAAI Conference on Artificial Intelligence, KDD, CVPR, ECCV, ICPR, SIGGRAPH, the Workshop on Reproducibility in Machine Learning and arXiv. The effort typically produces artifacts stored on platforms such as GitHub, Zenodo, OSF, Code Ocean and Figshare, shared via community channels such as Slack, Discord, Twitter, Reddit and LinkedIn.
The Challenge originated in response to reproducibility concerns highlighted by reports and initiatives from groups including Nature, Science, AAAS, the National Science Foundation, the European Research Council, the Wellcome Trust, the Bill & Melinda Gates Foundation, OpenAI, the Mozilla Foundation and The Alan Turing Institute. Early organizational leadership involved academics affiliated with the University of Toronto, McGill University, University College London, Rice University, Yale University, Columbia University, Cornell University, Brown University, the University of Washington and Johns Hopkins University. Coordination models drew on precedents from projects such as ReScience C, the Open Science Framework and the Center for Open Science, and on initiatives by the International Committee of Medical Journal Editors and the Public Library of Science.
Organizational structures often mirror the governance of bodies such as the Association for Computing Machinery, the Institute of Electrical and Electronics Engineers, the Royal Society, the American Mathematical Society and the Society for Industrial and Applied Mathematics, incorporating student organizers, faculty mentors, and program chairs, and collaborating with editorial boards from venues such as the Journal of Machine Learning Research, Transactions on Machine Learning Research and Communications of the ACM.
Participants select target papers published at venues like NeurIPS, ICML, ICLR, CVPR, ACL, KDD or posted on arXiv and attempt to replicate experiments. Submissions include code repositories on platforms such as GitHub, documentation deposited on Zenodo or Figshare, and reproducibility reports submitted to workshops affiliated with NeurIPS, ICLR workshops or specialized tracks in JMLR or Transactions on Machine Learning Research.
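As an illustration, a submission of this kind bundles a paper reference, a code location, an archived artifact, and a report. The sketch below is a hypothetical metadata record with a minimal completeness check; the field names, URL, and DOI are illustrative placeholders, not a schema prescribed by the Challenge:

```python
# Hypothetical metadata record for a reproducibility submission.
# Every identifier below is a placeholder, not a real paper or repository.
submission = {
    "target_paper": "arXiv:0000.00000",                      # placeholder paper ID
    "venue": "NeurIPS",
    "code_repository": "https://github.com/example/repro",   # hypothetical URL
    "archived_artifact": "10.5281/zenodo.0000000",           # hypothetical DOI
    "report": "report.pdf",
    "claims_tested": ["main result", "ablation study"],
}

def is_complete(sub):
    """A submission is complete only if every required field is present and non-empty."""
    required = ("target_paper", "code_repository", "report")
    return all(sub.get(field) for field in required)
```

A check like `is_complete` is the sort of gate an organizer's intake script might run before a report enters review.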
Evaluations are performed by organizers, peer reviewers, and external mentors drawn from Google Research, DeepMind, Facebook AI Research, Microsoft Research and academic groups at Stanford University, MIT, UC Berkeley, Carnegie Mellon University and ETH Zurich. The process often uses continuous integration systems like Travis CI, GitHub Actions and CircleCI for automated checks, and containerization tools such as Docker and Kubernetes to standardize environments. Assessment criteria reference reproducibility frameworks from the Center for Open Science, ReproZip, Binder and Code Ocean, and editorial policies from Nature and Science.
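The tolerance-based comparison such automated checks perform can be sketched in Python: reported metrics from the paper are compared against re-run metrics, and the reproduction passes only if every value falls within a relative tolerance. The metric name, values, and 5% tolerance below are illustrative assumptions, not criteria used by the Challenge:

```python
# Sketch of an automated reproducibility check of the kind a CI job
# (e.g. a GitHub Actions step) might run after re-executing an experiment.
def check_reproduction(reported, reproduced, rel_tol=0.05):
    """Compare a paper's reported metrics against re-run metrics.

    Returns (passed, diffs): passed is True only if every reported metric
    is present in the re-run and within rel_tol relative error.
    """
    diffs = {}
    for name, paper_value in reported.items():
        rerun_value = reproduced.get(name)
        if rerun_value is None:
            diffs[name] = "missing from re-run"
            continue
        rel_err = abs(rerun_value - paper_value) / max(abs(paper_value), 1e-12)
        if rel_err > rel_tol:
            diffs[name] = f"relative error {rel_err:.3f} exceeds tolerance {rel_tol}"
    return (len(diffs) == 0, diffs)

# Hypothetical numbers: the paper reports 94.2% accuracy, the re-run reaches 93.8%,
# which is within 5% relative error, so the check passes.
ok, diffs = check_reproduction({"accuracy": 94.2}, {"accuracy": 93.8})
```

Running the same check inside a pinned Docker image is what makes the pass/fail outcome meaningful across machines, since the environment is held fixed while only the metrics vary.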
Outcomes have included verified reproductions, partial reproductions, and documented failures, influencing authors at institutions such as Stanford University, the University of Toronto, the University of Washington and ETH Zurich to update code and publish errata. The Challenge has informed policy discussions at conferences like NeurIPS, ICML and ICLR and influenced repositories maintained by arXiv moderators and organizations such as the ACL Anthology and Papers with Code. It has contributed to the development of best-practice guidelines used by journals such as Nature, Science, PLoS ONE and JMLR and by funding agencies including the National Science Foundation, the European Research Council and UK Research and Innovation.
The project has helped train students and reproducibility advocates from programs at Stanford University, UC Berkeley, Carnegie Mellon University, the University of Oxford, the University of Cambridge and ETH Zurich, and has fostered collaborations with industry teams at Google Research, DeepMind, OpenAI and Microsoft Research.
Critiques include resource constraints noted by contributors from Stanford University, MIT, UC Berkeley and Harvard University, and methodological debates involving groups at Carnegie Mellon University, University College London and ETH Zurich. Limitations include hardware-access disparities highlighted by organizations such as NVIDIA, Intel, Amazon Web Services and Google Cloud Platform, and concerns about incentives raised by panels at NeurIPS and ICML. Other criticisms reference reproducibility discussions in reports by Nature, Science, AAAS, the Royal Society and the Center for Open Science.
Further debate involves attribution and citation norms governed by Committee on Publication Ethics and formatting expectations from ACM and IEEE, as well as legal and licensing complexities involving contributors from Stanford University, MIT, Harvard University and corporate labs including Facebook AI Research and Google Research.
Representative reproductions have targeted widely cited works from authors affiliated with Stanford University, MIT, University of Toronto, Carnegie Mellon University, Google Research, DeepMind, OpenAI, Facebook AI Research and Microsoft Research. Case studies often examine landmark papers presented at NeurIPS and ICML and assess codebases hosted on GitHub with artifacts archived on Zenodo or Figshare. Educational case studies have been integrated into coursework at Stanford University, MIT, UC Berkeley, Carnegie Mellon University and University of Cambridge, and featured in panels with representatives from DeepMind, OpenAI, Google Research and Microsoft Research.