OpenAlex · Updated hourly · Last updated: 08.05.2026, 13:15

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Latent Supervision: A Method for Improved Performance and Calibration of Machine Learning Classification Models in Ophthalmology

2026 · 0 citations · Ophthalmology Science · Open Access

Citations: 0 · Authors: 9 · Year: 2026

Abstract

Purpose: Standard supervised learning assumes deterministic labels (e.g., positive or negative, present or absent), neglecting the diagnostic uncertainty inherent in clinical practice. We propose latent supervision, a novel algorithm that applies latent class analysis to combine multiple diagnostic tests or expert opinions into probabilistic labels (soft labels) for more accurate, better-calibrated ophthalmic artificial intelligence classifiers.

Design: Comparison of calibration and classification performance among ophthalmic computer vision model training techniques.

Subjects: Eleven thousand three hundred fifty-eight children aged 0 to 9 years evaluated as part of multinational trachoma screening, and 2100 adult subjects with bacterial or fungal keratitis (or both) collated from multiple prior clinical studies.

Methods: We compared latent supervision against supervised learning methods in 2 computer vision scenarios: trachoma screening with labels from different grader teams, and pathogen (fungal vs. bacterial) differentiation in infectious keratitis using labels derived from culture and smear results.

Main Outcome Measures: Classification performance was measured using the area under the receiver operating characteristic curve, F1 score, and accuracy. Model calibration was assessed by calibration curves and Brier score decomposition.

Results: For the trachoma screening scenario, the grader 1, grader 2, grader 3, and ensemble supervised models had areas under the receiver operating characteristic curve of 0.88, 0.91, 0.90, and 0.93, respectively. The latent supervision model had an area under the receiver operating characteristic curve of 0.94 with better calibration. For the infectious keratitis scenario, the culture-supervised model outperformed the smear-supervised model in bacterial keratitis classification (area under the receiver operating characteristic curve 0.87 vs. 0.79), while the opposite was true in fungal keratitis classification (area under the receiver operating characteristic curve 0.78 vs. 0.85). The latent supervision model performed consistently well on both tasks with an area under the receiver operating characteristic curve of 0.86, and showed good calibration.

Conclusions: Latent supervision provides a computationally inexpensive method to train artificial intelligence models using probabilistic labels, which model the diagnostic uncertainty inherent in medicine. In both test cases evaluated, latent supervision appeared to provide benefits both in terms of classification performance and model calibration. The latter is particularly important to improve public trust and enable meaningful clinical implementation.

Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
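The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes binary ratings from several graders or tests, two latent classes, and conditional independence of the ratings given the true class, and it fits that latent class model with a plain EM loop. The function names `latent_class_soft_labels` and `soft_cross_entropy` are hypothetical, chosen for this example.

```python
import math


def _clip(p, eps=1e-6):
    """Keep a probability strictly inside (0, 1) to avoid log(0) and division by zero."""
    return min(max(p, eps), 1.0 - eps)


def latent_class_soft_labels(ratings, n_iter=200, tol=1e-8):
    """Fit a two-class latent class model to binary ratings with EM.

    ratings: list of rows, one per subject; each row holds the 0/1 results
    of the same J graders or tests (assumed conditionally independent given
    the unobserved true class).
    Returns (soft_labels, prevalence, sensitivities, specificities).
    """
    n = len(ratings)
    n_items = len(ratings[0])
    # Start from the fraction of positive ratings per subject, so the
    # "positive" latent class lines up with mostly-positive rating patterns.
    post = [_clip(sum(row) / n_items, 0.05) for row in ratings]
    prev_ll = -math.inf
    for _ in range(n_iter):
        # M-step: re-estimate prevalence and per-rater sensitivity/specificity.
        total_pos = max(sum(post), 1e-12)
        total_neg = max(n - sum(post), 1e-12)
        prevalence = _clip(sum(post) / n)
        sens = [_clip(sum(p * row[j] for p, row in zip(post, ratings)) / total_pos)
                for j in range(n_items)]
        spec = [_clip(sum((1 - p) * (1 - row[j]) for p, row in zip(post, ratings)) / total_neg)
                for j in range(n_items)]
        # E-step: posterior probability of the positive class for each subject.
        log_lik = 0.0
        new_post = []
        for row in ratings:
            like_pos, like_neg = prevalence, 1.0 - prevalence
            for j, x in enumerate(row):
                like_pos *= sens[j] if x else 1.0 - sens[j]
                like_neg *= 1.0 - spec[j] if x else spec[j]
            marginal = like_pos + like_neg
            new_post.append(like_pos / marginal)
            log_lik += math.log(marginal)
        post = new_post
        if abs(log_lik - prev_ll) < tol:
            break
        prev_ll = log_lik
    return post, prevalence, sens, spec


def soft_cross_entropy(predicted, soft_labels, eps=1e-12):
    """Mean binary cross-entropy of predictions against probabilistic targets."""
    total = 0.0
    for p, q in zip(predicted, soft_labels):
        total -= q * math.log(max(p, eps)) + (1 - q) * math.log(max(1 - p, eps))
    return total / len(soft_labels)


# Toy usage: three hypothetical graders rating eight images.
toy_ratings = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 0],
               [0, 1, 0], [1, 1, 1], [0, 0, 0], [1, 0, 1]]
toy_soft, toy_prev, toy_sens, toy_spec = latent_class_soft_labels(toy_ratings)
```

In a training pipeline of this kind, the returned soft labels would replace hard 0/1 targets in a loss such as `soft_cross_entropy`, so that images on which the graders or tests disagree contribute a less confident target. Whether the published method uses this loss, this initialisation, or the conditional-independence assumption is not stated in the abstract.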

Similar works