This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Independent bone-level diagnostic accuracy study of an AI tool for detecting appendicular skeletal fractures on radiographs

2026 · 0 citations · European Radiology · Open Access

Abstract

OBJECTIVES: To perform an in-depth evaluation of the diagnostic test accuracy of a commercially available AI tool for assistance in fracture detection on radiographs.

MATERIALS AND METHODS: This retrospective study included consecutive patients with trauma radiographs at seven Danish hospitals. The AI output was evaluated using the clinical radiologic report as the reference standard for a binary fracture outcome. The report is based on assessments by an emergency physician, a senior orthopedic surgeon, and a radiology expert. Sensitivity, specificity, and positive and negative predictive values were calculated. Sensitivity and specificity were additionally stratified for children, degenerative disease, metal, old fractures, casting, obvious fractures, and inter-hospital differences. Bone-wise sensitivity and specificity were assessed for multiple-fracture cases and individual bones.

RESULTS: The study sample consisted of 2783 patients (median age 38 years; IQR: 21-64; 1443 female), of whom 948 (34%) had the target finding. The AI tool demonstrated an overall sensitivity of 89% (95% CI: 87%-91%) and a specificity of 88% (95% CI: 86%-89%). Specificity was 57% (95% CI: 49%-65%) in examinations with old fractures. Bone-wise sensitivity for carpal fractures ranged from 25% (95% CI: 1%-81%) for other carpals to 75% (95% CI: 43%-95%) for the triquetrum. Bone-wise sensitivity for tarsal fractures ranged from 0% (95% CI: 0%-60%) for the medial cuneiform to 53% (95% CI: 27%-79%) for the talus.

CONCLUSION: The AI tool demonstrated high overall diagnostic accuracy and performed robustly across most specific situations. However, specificity was substantially reduced in the presence of old fractures. The bone-wise analysis showed great variability, with a pattern of poor accuracy for short, irregular bones.

KEY POINTS:
Question: Can a commercially available AI tool reliably detect fractures across anatomical regions, confounding factors, and individual bones, and are there patterns in its diagnostic limitations?
Findings: The AI tool achieved 89% sensitivity and 88% specificity with consistent accuracy across subgroups. However, accuracy dropped for old fractures and irregular short bones.
Clinical relevance: Despite broad regulatory approval, AI fracture tools may overlook clinically relevant weaknesses. Our in-depth evaluation highlights limitations, guiding responsible clinical use and future research to support safe AI implementation in radiology and informed medicolegal regulation.
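The accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It computes sensitivity, specificity, and predictive values from a 2x2 table, plus an exact (Clopper-Pearson) binomial confidence interval. The abstract does not state which interval method the authors used, so Clopper-Pearson is an assumption here, and the function names and the counts in the example are hypothetical, not study data.

```python
from math import exp, lgamma, log


def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimates from a 2x2 table (AI output vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),  # fractures that the tool flagged
        "specificity": tn / (tn + fp),  # non-fractures that the tool cleared
        "ppv": tp / (tp + fp),          # flagged cases that were fractures
        "npv": tn / (tn + fn),          # cleared cases that were fracture-free
    }


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def _root_of_decreasing(f, iters=60):
    """Bisection on (0, 1) for a function that decreases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial interval for x successes out of n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_of_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_of_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    print(diagnostic_metrics(tp=90, fp=20, fn=10, tn=80))
    # Very small groups give very wide intervals.
    print(clopper_pearson(0, 4))
    print(clopper_pearson(1, 4))
```

With very few cases the exact interval is wide: 0 detections out of 4 gives an upper bound of roughly 60%, and 1 out of 4 spans roughly 1% to 81%. Intervals of that width are what one would expect for rare fracture sites, although the abstract does not report the group sizes.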

Topics

Artificial Intelligence in Healthcare and Education · Orthopedic Surgery and Rehabilitation · Bone fractures and treatments
