This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Estimation of IPSS and OABSS scores using ChatGPT-4o: a comparative validation study in Korea
Citations: 0
Authors: 8
Year: 2026
Abstract
To evaluate the performance of ChatGPT-4o in estimating the International Prostate Symptom Score (IPSS) and Overactive Bladder Symptom Score (OABSS) from patients’ natural-language symptom descriptions and full outpatient records, compared against actual questionnaire scores. This study included 91 patients, of whom 52 completed the IPSS and 77 completed the OABSS. ChatGPT-4o was prompted with verbatim symptom statements and full medical records written by a urologist. Predicted scores were compared with actual scores using paired t-tests, weighted Cohen’s kappa for item-level agreement, Spearman’s correlation for total scores, and Bland–Altman plots for bias. Diagnostic classifications (lower urinary tract symptoms [LUTS]: IPSS ≥8; overactive bladder [OAB]: OABSS ≥3 with urgency ≥2) were assessed using McNemar’s test and receiver operating characteristic (ROC) curve analysis. Mean IPSS scores estimated by ChatGPT-4o were significantly lower than patient-reported scores (11.2 vs. 13.6, p = 0.006), whereas OABSS scores did not differ significantly between the two methods (6.99 vs. 6.86, p = 0.686). Diagnostic agreement was high: LUTS in 42 (actual) vs. 38 (GPT) patients, and OAB in 51 vs. 50 patients. The area under the curve was 0.81 for IPSS and 0.91 for OABSS. Kappa values ranged from 0.23 to 0.81 (IPSS) and 0.44 to 0.71 (OABSS), with the highest concordance in quality of life (QoL) and urgency incontinence. Spearman’s correlation coefficient was 0.60 (IPSS) and 0.70 (OABSS). Accuracy was lower in first-visit patients. GPT-4o estimated IPSS and OABSS with moderate but clinically acceptable accuracy. Its performance was comparable regarding diagnostic classification, particularly for QoL and OABSS. ChatGPT-4o may complement traditional questionnaires, particularly when patient-reported data are missing or incomplete.
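The two agreement statistics the abstract relies on can be illustrated with a short sketch: a quadratic-weighted Cohen’s kappa for item-level agreement between GPT-estimated and patient-reported ordinal scores, and Spearman’s correlation for total scores. This is a minimal pure-Python illustration with hypothetical toy ratings, not the study’s actual data or analysis code.

```python
def weighted_kappa(a, b, k):
    """Quadratic-weighted Cohen's kappa for two raters on categories 0..k-1."""
    n = len(a)
    # Observed agreement matrix
    obs = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x][y] += 1
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2   # quadratic disagreement weight
            num += w * obs[i][j]
            den += w * row[i] * col[j] / n    # chance-expected counts
    return 1 - num / den

def _ranks(v):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            r[order[t]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical item-level ratings (e.g. one IPSS item scored 0-2 by
# patient vs. model) and hypothetical total scores:
patient_item = [0, 0, 1, 2]
model_item   = [0, 1, 1, 2]
kappa = weighted_kappa(patient_item, model_item, k=3)   # → 0.8

patient_total = [13, 6, 21, 9, 17]
model_total   = [11, 7, 19, 8, 15]
rho = spearman(patient_total, model_total)              # → 1.0 (perfectly monotone toy data)
```

On real data the study additionally used Bland–Altman plots and McNemar’s test for the diagnostic cut-offs; those are omitted here since the kappa and correlation values are the figures quoted in the abstract.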
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,312 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,169 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,564 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,466 citations