Announcing the NeurIPS 2021 Datasets and Benchmarks Track Papers
Joaquin Vanschoren and Serena Yeung
Because there are no good models without good data, and only robust benchmarks measure true progress, NeurIPS launched the new Datasets and Benchmarks track to serve as a venue for exceptional work focused on creating high-quality datasets, insightful benchmarks, and discussions on how to improve dataset development and data-oriented work more broadly. Further details about the motivation and setup are discussed in our earlier blog post.
In this inaugural year, we organized two rounds of submissions to get timely feedback from the community. Over the two rounds, we received 484 papers. We were pleasantly surprised by the quality and breadth of these submissions, out of which 174 have been accepted for publication. Please explore the final list of accepted papers.
The reviewing process involved a set of specific attention points, such as the long-term accessibility, ethics, and documentation quality of datasets, and the reproducibility of benchmarks. We are immensely grateful to the 33 area chairs and 548 reviewers, whose tremendous contributions made this new endeavor a success.
Of the 174 accepted papers, approximately 20% relate to computer vision, 15% to natural language processing, 15% to reinforcement learning and simulation environments, 7% to speech recognition, and 6% to multimodal data. In addition, 15% of papers cover meta-analyses, ethics, and explainability, and 22% cover various other topics. Overall, 55% of papers were identified as introducing new datasets, 20% as introducing new benchmarks, and 25% as a combination of both. While these are rough estimates, we hope they provide a sense of the distribution of topics in this year’s track.
The accepted papers will be presented in four oral and poster sessions alongside the main NeurIPS conference orals and posters. Since this is the first year that this track is organized, we will also hold a special symposium event on Thursday of the main conference where the impact and open challenges in creating datasets and running benchmarks will be openly discussed. For the symposium, we are excited to welcome as keynote speakers Olga Russakovsky (Princeton University), Raquel Urtasun (University of Toronto and Waabi), Erin LeDell (H2O.ai), and Douwe Kiela (FAIR).
The full schedule of NeurIPS Datasets and Benchmarks Track events can be found below. Please register for NeurIPS (if you haven’t already) and join us in this exciting new track!
Tuesday, December 7th | |
8.00 am - 10.00 am | Oral presentations |
4.30 pm - 6.00 pm | Poster session |
Wednesday, December 8th | |
4.30 pm - 6.00 pm | Poster session |
4.00 pm - 5.00 pm | Oral presentations |
Thursday, December 9th | |
4.30 pm - 6.00 pm | Poster session |
7.00 pm - 10.00 pm | Track Symposium |
Friday, December 10th | |
8.00 am - 9.00 am | Oral presentations |
4.30 pm - 6.00 pm | Poster session |