This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Iterative Dual-AI Consultation for Error Detection in Clinical Medicine: A Case Study Demonstrating Convergent Validity Through Cross-Validation of Large Language Models.
Citations: 0
Authors: 1
Year: 2026
Abstract
Background: Large language models have shown promise in medical data analysis, but serious concerns about reliability and error propagation persist. This study reports a novel approach in which two independent AI systems consult each other iteratively to analyze complex clinical neuroimaging data.

Methods: A 63-year-old woman with a family history of Alzheimer's disease and Parkinsonism underwent brain MRI volumetry, which showed apparent 10-13% increases in gray matter volume following intensive multimodal interventions (Functional Medicine and HYLANE™ treatment). Despite clinical improvement, performance on objective cognitive testing declined during the same period. Two AI systems (Claude and Perplexity) independently analyzed the neuroimaging reports, cognitive testing, and clinical data over 5-7 iterative cycles, systematically challenging each other's interpretations.

Results: Initial analyses diverged substantially (45-60 percentage-point difference in probability estimates). Through autonomous error detection and cross-validation, the systems converged to a consensus (<10 percentage-point difference). Critical autonomous discoveries included: (1) a 3.5% increase in total intracranial volume (physiologically impossible, indicating measurement artifact), (2) an 11-month temporal gap between cognitive testing and MRI, and (3) a literature review revealing that hyperbaric oxygen therapy produces volumetric changes of at most 1-2%. Final consensus: modest real improvements (2-4%) embedded within measurement artifact (3-5%).

Conclusions: Dual-AI iterative consultation achieved autonomous error detection, literature integration, and convergent validity without requiring human identification of critical flaws. This approach may enhance reliability in complex clinical decision-making while maintaining appropriate physician oversight.

Keywords: artificial intelligence, clinical decision support, neuroimaging, automated volumetry, large language models, convergent validity, error detection.
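The two quantitative ideas in the abstract (a percentage-point convergence threshold, and a rise in total intracranial volume as a sign of measurement artifact) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the per-round probability estimates are invented placeholders rather than data from the study, and dividing gray matter volume by total intracranial volume (TIV) is one common normalization that the abstract does not say the authors used.

```python
from typing import List, Optional, Tuple


def tiv_normalized_change(gm_change_pct: float, tiv_change_pct: float) -> float:
    """Percent change in the GM/TIV ratio, given percent changes in each.

    Assumption: normalization by simple division by TIV. If TIV appears to
    grow (which should not happen in an adult), part of any apparent gray
    matter gain is plausibly a scaling artifact of the measurement.
    """
    ratio = (1 + gm_change_pct / 100) / (1 + tiv_change_pct / 100)
    return (ratio - 1) * 100


def has_converged(p_a: float, p_b: float, tolerance_pts: float = 10.0) -> bool:
    """True when two probability estimates (in percent) differ by less than
    tolerance_pts percentage points."""
    return abs(p_a - p_b) < tolerance_pts


def rounds_to_consensus(
    history: List[Tuple[float, float]], tolerance_pts: float = 10.0
) -> Optional[int]:
    """Return the 1-based round in which the two estimates first converge,
    or None if they never do."""
    for round_no, (p_a, p_b) in enumerate(history, start=1):
        if has_converged(p_a, p_b, tolerance_pts):
            return round_no
    return None


if __name__ == "__main__":
    # Apparent gray matter gains of 10% and 13%, with the reported 3.5% TIV rise.
    for gm in (10.0, 13.0):
        print(f"GM +{gm}% -> GM/TIV change {tiv_normalized_change(gm, 3.5):.2f}%")

    # Hypothetical per-round estimates (percent) from two systems; not study data.
    hypothetical_history = [(80.0, 25.0), (65.0, 40.0), (55.0, 48.0)]
    print("Converged in round:", rounds_to_consensus(hypothetical_history))
```

Under this simple division, a 3.5% apparent rise in TIV would account for a substantial share of a 10-13% apparent gray matter gain. This is in the same spirit as the abstract's conclusion that part of the measured change is artifact, but it does not reproduce the authors' analysis.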