This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
57
Citations
4
Authors
2021
Year
Abstract
Benchmark datasets play a central role in the organization of machine learning research. They coordinate researchers around shared research problems and serve as a measure of progress towards shared goals. Despite the foundational role of benchmarking practices in this field, relatively little attention has been paid to the dynamics of benchmark dataset use and reuse, within or across machine learning subcommunities. In this paper, we dig into these dynamics. We study how dataset usage patterns differ across machine learning subcommunities and across time from 2015-2020. We find increasing concentration on fewer and fewer datasets within task communities, significant adoption of datasets from other tasks, and concentration across the field on datasets that have been introduced by researchers situated within a small number of elite institutions. Our results have implications for scientific evaluation, AI ethics, and equity/access within the field.
Similar works
The global landscape of AI ethics guidelines
2019 · 4,612 citations
The Limitations of Deep Learning in Adversarial Settings
2016 · 3,876 citations
Trust in Automation: Designing for Appropriate Reliance
2004 · 3,431 citations
Fairness through awareness
2012 · 3,292 citations
Mind over Machine: The Power of Human Intuition and Expertise in the Era of the Computer
1987 · 3,184 citations