This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Estimation of IPSS and OABSS scores using ChatGPT-4o: a comparative validation study in Korea
Citations: 0
Authors: 8
Year: 2026
Abstract
To evaluate the performance of ChatGPT-4o in estimating the International Prostate Symptom Score (IPSS) and Overactive Bladder Symptom Score (OABSS) from patients’ natural-language symptom descriptions and full outpatient records, compared against actual questionnaire scores. This study included 91 patients, of whom 52 completed the IPSS and 77 completed the OABSS. ChatGPT-4o was prompted with verbatim symptom statements and full medical records written by a urologist. Predicted scores were compared with actual scores using paired t-tests, weighted Cohen’s kappa for item-level agreement, Spearman’s correlation for total scores, and Bland–Altman plots for bias. Diagnostic classifications (lower urinary tract symptoms [LUTS]: IPSS ≥8; overactive bladder [OAB]: OABSS ≥3 with urgency ≥2) were assessed using McNemar’s test and receiver operating characteristic (ROC) curve analysis. Mean IPSS scores estimated by ChatGPT-4o were significantly lower than patient-reported scores (11.2 vs. 13.6, p = 0.006), whereas OABSS scores did not differ significantly between the two methods (6.99 vs. 6.86, p = 0.686). Diagnostic agreement was high: LUTS in 42 (actual) vs. 38 (GPT) patients, and OAB in 51 vs. 50 patients. The area under the curve was 0.81 for IPSS and 0.91 for OABSS. Kappa values ranged from 0.23 to 0.81 (IPSS) and 0.44 to 0.71 (OABSS), with the highest concordance in quality of life (QoL) and urgency incontinence. Spearman’s correlation coefficient was 0.60 (IPSS) and 0.70 (OABSS). Accuracy was lower in first-visit patients. GPT-4o estimated IPSS and OABSS with moderate but clinically acceptable accuracy. Its performance was comparable regarding diagnostic classification, particularly for QoL and OABSS. ChatGPT-4o may complement traditional questionnaires, particularly when patient-reported data are missing or incomplete.
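The two agreement statistics the abstract relies on can be illustrated with a short sketch: a quadratic-weighted Cohen’s kappa for item-level agreement between GPT-estimated and patient-reported ordinal scores, and Spearman’s correlation for total scores. This is a minimal pure-Python illustration with hypothetical toy ratings, not the study’s actual data or analysis code.

```python
def weighted_kappa(a, b, k):
    """Quadratic-weighted Cohen's kappa for two raters on categories 0..k-1."""
    n = len(a)
    # Observed agreement matrix
    obs = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2   # quadratic disagreement weight
            num += w * obs[i][j]
            den += w * row[i] * col[j] / n    # chance-expected counts
    return 1 - num / den

def _ranks(v):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            r[order[t]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical item-level ratings (e.g. one IPSS item scored 0-2 by
# patient vs. model) and hypothetical total scores:
patient_item = [0, 0, 1, 2]
model_item   = [0, 1, 1, 2]
kappa = weighted_kappa(patient_item, model_item, k=3)   # → 0.8

patient_total = [13, 6, 21, 9, 17]
model_total   = [11, 7, 19, 8, 15]
rho = spearman(patient_total, model_total)              # → 1.0 (perfectly monotone toy data)
```

On real data the study additionally used Bland–Altman plots and McNemar’s test for the diagnostic cut-offs; those are omitted here since the kappa and correlation values are the figures quoted in the abstract.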
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,312 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,169 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,564 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,466 citations