This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Understanding Machine Learning testing in practice

2026 · 0 citations · Journal of Systems and Software · Open Access

Abstract

Machine Learning is increasingly embedded in critical software systems, making their quality assurance a matter of growing concern. While the research community has proposed several techniques for testing ML-enabled systems, there is limited empirical evidence on whether these techniques are adopted in practice or align with developers’ testing workflows. This paper presents a two-step empirical investigation aimed at characterizing the current landscape of ML testing in real-world development. Our goal is to understand how developers approach testing, whether proposed techniques are adopted, and what barriers hinder their implementation. We designed a mixed-method study that triangulates insights from two complementary sources: (1) a mining study of 398 open-source repositories to analyze implemented testing strategies and tool usage; and (2) a survey of 100 practitioners to capture perceptions, motivations, and practical challenges. Our findings reveal that developers rely heavily on foundational strategies like Smoke Testing and Rule-Based Checking, implemented through custom testing logic built on general-purpose libraries (e.g., PyTest, NumPy). Conversely, we identified a critical adoption gap in specialized tools and advanced techniques such as Metamorphic Testing, which are rarely implemented despite their academic prominence. Our survey indicates that this gap is driven by practical barriers, including high integration costs and a poor fit with existing developer workflows. These findings suggest that future research and tooling must prioritize usability, integration, and a clearer alignment with the pragmatic needs of developers.

• Large-scale mixed-method investigation of ML testing practices in real-world development.
• Triangulated insights from 398 open-source repositories (2,018 test files) and 100 practitioners.
• Practitioners rely on foundational strategies like Smoke Testing, implemented via custom solutions.
• Critical adoption gap for specialized tools and advanced techniques due to workflow integration barriers.
• Released datasets, analysis scripts, and a technical report to enable replication.

Topics

Explainable Artificial Intelligence (XAI); Artificial Intelligence in Healthcare and Education; Machine Learning and Data Classification
