This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
AI-RADS
Citations: 0
Authors: 7
Year: 2026
Abstract
BACKGROUND: Despite the growing number of artificial intelligence (AI)-based applications used in radiology, no structured framework exists to assess their case-level reliability or to document overridden outputs in reports.

PURPOSE: To develop and evaluate the Artificial Intelligence Reporting and Data System (AI-RADS), a structured framework for an objective, case-level assessment of AI output reliability, clinical utility, and recommended actions in radiology.

MATERIALS AND METHODS: The AI-RADS framework was tested in a retrospective, multireader study. Here, 5 board-certified radiologists independently evaluated 350 cases processed by 7 representative AI applications for image-based and generative tasks. Each case was assigned one of 5 AI-RADS categories, applicable modifiers, and an independent correctness rating as a reference. Interreader agreement was quantified using Krippendorff's α with 95% CIs.

RESULTS: Substantial interreader agreement was observed for the core AI-RADS categories in both image-based (Krippendorff's α=0.87; 95% CI: 0.83-0.91) and generative AI tasks (Krippendorff's α=0.93; 95% CI: 0.91-0.95). Reader-assigned correctness aligned well with AI-RADS categories 1 to 2, which indicate outputs suitable for integration into clinical workflows. Outputs rated as "incorrect" were predominantly assigned to categories 4 to 5, warranting override or removal from display.

CONCLUSION: AI-RADS provides a structured framework for the case-level evaluation of AI output reliability, clinical utility, and consequences for report communication. This multireader study demonstrated substantial interreader agreement and applicability across various AI applications.
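The agreement statistic named in the abstract, Krippendorff's α, can be illustrated with a minimal sketch. The abstract does not state which difference metric or which confidence-interval procedure the authors used, so everything below is an assumption for illustration: the nominal (match/mismatch) metric, a simple percentile bootstrap that resamples cases, the function names, and the toy ratings, which are invented and are not study data.

```python
import random
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha with the nominal difference metric.

    units: list of cases; each case is a list of the ratings it received,
    one per reader, with None marking a missing rating.
    """
    # Coincidence matrix: every ordered pair of ratings from different
    # readers within a case contributes 1 / (m - 1), m = ratings in the case.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # a case with fewer than two ratings cannot be paired
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the total number of pairable ratings.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        raise ValueError("alpha is undefined when fewer than two categories occur")
    # alpha = 1 - D_o / D_e, simplified for the nominal metric.
    return 1.0 - (n - 1) * observed / expected


def bootstrap_ci(units, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap over cases (an assumed, simplified CI procedure)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [units[rng.randrange(len(units))] for _ in units]
        try:
            estimates.append(krippendorff_alpha_nominal(sample))
        except ValueError:
            continue  # skip degenerate resamples with a single category
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lo = estimates[int((1 - level) / 2 * (len(estimates) - 1))]
    hi = estimates[int((1 + level) / 2 * (len(estimates) - 1))]
    return lo, hi


# Invented toy data: 6 cases, 3 readers, categories 1 to 5.
toy_cases = [[1, 1, 1], [2, 2, 1], [4, 4, 4], [5, 5, 4], [3, 3, 3], [1, 2, 1]]
alpha = krippendorff_alpha_nominal(toy_cases)
ci_low, ci_high = bootstrap_ci(toy_cases, n_boot=500)
print(f"alpha = {alpha:.2f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

Because the five AI-RADS categories are ordered, an ordinal difference metric would also be a plausible choice; the nominal metric is used here only because it keeps the sketch short.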