This is an overview page with metadata for this scientific work. The full article is available from the publisher.
A Bibliometric Analysis of Benchmark Datasets in Machine Learning Research: Insights from Scopus (2001–2024)
0
Citations
2
Authors
2025
Year
Abstract
Benchmark datasets are critical for advancing machine learning (ML), enabling standardized evaluation, reproducibility, and progress in domains like computer vision, natural language processing, and cybersecurity. This study conducts a comprehensive bibliometric analysis of benchmark dataset research in ML from 2001 to 2024, using Scopus data to examine citation patterns, publication trends, geographic and institutional contributions, and underexplored areas. Findings highlight the dominance of datasets such as DAVIS, UIEB, and ISCX Intrusion Detection, alongside leading venues like Lecture Notes in Computer Science and Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. China and the United States lead in publication output, with institutions like Wuhan University driving significant contributions. A 15-fold increase in publications since 2001, particularly post-2015, reflects the rise of deep learning. Quantitative keyword analysis and a co-occurrence network identify underexplored areas, including cheminformatics, seismology, fake content detection, and 3D data processing, revealing critical gaps in interdisciplinary benchmarking. The study emphasizes ethical dataset development, addressing biases and privacy to align with sustainable AI practices. These insights guide the creation of standardized, inclusive evaluation frameworks to enhance ML research reproducibility and societal impact.
Similar works
Federated Learning: Challenges, Methods, and Future Directions
2020 · 4,398 citations
Deep Learning: Methods and Applications
2014 · 3,306 citations
Mobile Edge Computing: A Survey on Architecture and Computation Offloading
2017 · 2,900 citations
Machine Learning: An Artificial Intelligence Approach
2013 · 2,639 citations
Machine learning and deep learning
2021 · 2,335 citations