This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Artificial Intelligence in Occupational Health Surveillance: Evaluating AI-Assisted ILO Classification of Radiographs of Pneumoconioses
Citations: 0
Authors: 8
Year: 2026
Abstract
BACKGROUND: Pneumoconioses remain an important occupational health issue, particularly in low- and middle-income countries. The International Labour Organization (ILO) Classification standardizes chest radiograph interpretation but requires trained readers and is affected by inter-reader variability. This study evaluated whether generative multimodal artificial intelligence (AI) models can approximate ILO-based diagnostic reasoning.
METHODS: Eighty-two chest radiographs from the official NIOSH B Reader syllabus were analysed using four AI systems (GPT-4o, GPT-5, MedGemma-4B, MedGemma-27B). Each image was evaluated with a standardized prompt based on the 2022 revised ILO guidelines using deterministic settings. Model outputs were mapped to ILO codes and compared with the official answer keys of the ILO Standard Radiograph Set used for B Reader training and examination. Performance metrics included balanced accuracy, sensitivity, specificity, precision, and Matthews correlation coefficient (MCC). Bootstrap 95% confidence intervals, McNemar's test, and Cohen's κ assessed performance variability and agreement.
RESULTS: All four AI models showed moderate diagnostic performance, with balanced accuracy ranging from 60.8% to 70.3%. Sensitivity remained limited (35.5%-54.9%), while specificity was consistently high (84.6%-86.2%). MedGemma-27B performed best for small opacities, GPT-5 for pleural abnormalities and for technical quality. Large opacities and rare findings were systematically under-detected. Statistical comparisons showed significant differences between models, although agreement patterns were broadly similar.
CONCLUSION: All AI models partially followed structured ILO radiographic criteria but did not achieve expert-level performance, confirming that they cannot replace certified B Readers. Larger, real-world datasets are needed to assess their potential clinical utility as supportive tools in occupational health surveillance programs.
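The performance metrics named in the abstract can all be derived from a binary confusion matrix for a single finding (for example, "small opacities present" versus "absent"). The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the confusion counts in the usage example are hypothetical and chosen purely for illustration; they are not taken from the study.

```python
import math


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, tn, fn: true positives, false positives,
    true negatives, false negatives (hypothetical inputs).
    """
    sensitivity = _ratio(tp, tp + fn)   # recall on truly positive images
    specificity = _ratio(tn, tn + fp)   # recall on truly negative images
    precision = _ratio(tp, tp + fp)     # share of positive calls that are correct
    # Balanced accuracy averages the two recalls, so it is not
    # inflated when one class dominates the dataset.
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient: ranges from -1 to 1, 0 means chance level.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (NOT data from the study).
example = binary_metrics(tp=40, fp=10, tn=90, fn=60)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, low sensitivity combined with high specificity yields only a moderate balanced accuracy, which is the same qualitative pattern the abstract reports.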
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,693 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,598 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,124 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,871 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations