This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Towards a framework for interoperability and reproducibility of predictive models
Citations: 8
Authors: 7
Year: 2023
Abstract
The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lack standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics), and, ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that auto-completes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.
Problem: The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lack standard methodologies, leading to problems of scientific reproducibility and interoperability.
What is Already Known: Although there are many tools for model replication, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons. Moreover, questions around generalizability and transportability are not readily answered.
What this Paper Adds: This study builds upon the Predictive Model Markup Language (PMML) to capture essential information and presents the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that auto-completes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets. The proposed AMP provides a framework for automating the completion and evaluation of a comprehensive ML model checklist, increasing compliance and, ultimately, the reproducibility, sharing, and comparison of predictive models by ensuring all appropriate information is available.
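The idea of auto-completing a checklist from an extended PMML file can be illustrated with a minimal sketch. It is not the AMP implementation: the checklist items, the `test_metrics` extension name, the `Metric` child element, and the helper names `build_pmml` and `complete_checklist` are all assumptions made for illustration, since the abstract does not list the actual checklist or the schema extensions. The sketch builds a small PMML-like document with Python's standard library, stores evaluation metrics in a PMML `Extension` element, and then marks each hypothetical checklist item as satisfied when the corresponding element is present.

```python
import xml.etree.ElementTree as ET

# Hypothetical checklist: item name -> ElementTree path that must exist.
# The paper's real checklist is not given in the abstract.
CHECKLIST = {
    "has_header": "Header",
    "has_data_dictionary": "DataDictionary",
    "has_preprocessing": "TransformationDictionary",
    "has_test_metrics": "Extension[@name='test_metrics']",
}


def build_pmml(fields, metrics=None):
    """Build a minimal PMML-like tree (namespace omitted for brevity).

    fields  -- list of (name, optype, dataType) tuples
    metrics -- optional dict of evaluation metrics, stored in an Extension
    """
    root = ET.Element("PMML", version="4.4")
    ET.SubElement(root, "Header", description="illustrative model")
    data_dict = ET.SubElement(
        root, "DataDictionary", numberOfFields=str(len(fields))
    )
    for name, optype, data_type in fields:
        ET.SubElement(
            data_dict, "DataField", name=name, optype=optype, dataType=data_type
        )
    if metrics:
        # "Metric" is an assumed child element, not part of standard PMML.
        ext = ET.SubElement(root, "Extension", name="test_metrics")
        for key, value in metrics.items():
            ET.SubElement(ext, "Metric", name=key, value=str(value))
    return root


def complete_checklist(root):
    """Return {item: True/False} depending on whether each element exists."""
    return {item: root.find(path) is not None for item, path in CHECKLIST.items()}


if __name__ == "__main__":
    doc = build_pmml([("age", "continuous", "double")], metrics={"auc": 0.8})
    for item, satisfied in complete_checklist(doc).items():
        print(f"{item}: {satisfied}")
```

In this sketch a missing element (here, the preprocessing section) simply shows up as an unsatisfied item, which is the kind of gap a reproducibility checklist is meant to surface.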