This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Evaluation of large language models in assigning PI-RADS v2.1 categories for prostate MRI reports
Citations: 0
Authors: 2
Year: 2026
Abstract
This retrospective study evaluated how well large language models (LLMs) classify prostate MRI reports according to the Prostate Imaging–Reporting and Data System (PI-RADS) version 2.1, and sought to validate their use in supporting clinical decisions in prostate cancer treatment. The study included 146 patients. Four LLMs (GPT-4o, GPT-o1, Google Gemini 1.5 Pro and Google Gemini 2.0 Experimental Advanced) were tested on standardised, structured prostate MRI reports. Model output was compared against a reference standard established by consensus of two radiologists. Agreement was measured using weighted Cohen’s kappa, and accuracy and F1 scores were calculated for three PI-RADS risk groups: low (1–2), intermediate (3) and high (4–5). Performance varied by model. GPT-o1 achieved the highest level of agreement with radiologists (κ = 0.867), followed by GPT-4o (κ = 0.743), Gemini 1.5 Pro (κ = 0.728) and Gemini 2.0 Experimental Advanced (κ = 0.664). GPT-o1 also achieved the highest F1 scores for the low-risk (0.93) and high-risk (1.00) groups, but only moderate performance for the PI-RADS 3 group (0.75). All models performed weakly on PI-RADS 3 (F1 range: 0.54–0.75). Notably, none of the models produced invalid output outside the target PI-RADS 1–5 range. LLMs show potential for automating PI-RADS classification from MRI reports, with GPT-o1 demonstrating the best overall performance. However, their weak performance on PI-RADS 3 lesions indicates that multicentre validation, larger datasets and multimodality integration are needed before they can be used clinically for prostate cancer diagnosis and urological decision-making. Clinical trial registration: not applicable, as this retrospective study did not involve a clinical trial.
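The evaluation described in the abstract (weighted Cohen’s kappa on the five PI-RADS categories, plus F1 on the three risk groups) can be illustrated with a minimal Python sketch. It is not the authors’ code. The function names and the toy ratings are hypothetical, and the abstract does not state whether linear or quadratic weights were used, so the weighting scheme is left as a parameter with linear as an assumed default.

```python
from collections import Counter

CATEGORIES = [1, 2, 3, 4, 5]  # valid PI-RADS v2.1 categories


def risk_group(score):
    """Map a PI-RADS category to the risk groups named in the abstract."""
    if score not in CATEGORIES:
        # Corresponds to the "invalid output outside 1-5" check.
        raise ValueError(f"invalid PI-RADS category: {score!r}")
    if score <= 2:
        return "low"
    if score == 3:
        return "intermediate"
    return "high"


def weighted_kappa(reference, predicted, weights="linear"):
    """Weighted Cohen's kappa; the weighting scheme is an assumption."""
    n = len(reference)
    if n == 0 or n != len(predicted):
        raise ValueError("rating lists must be non-empty and of equal length")
    power = 1 if weights == "linear" else 2  # anything else means quadratic
    index = {c: i for i, c in enumerate(CATEGORIES)}
    observed = Counter((index[r], index[p]) for r, p in zip(reference, predicted))
    ref_marginal = Counter(index[r] for r in reference)
    pred_marginal = Counter(index[p] for p in predicted)
    observed_dis = 0.0  # weighted observed disagreement
    expected_dis = 0.0  # weighted disagreement expected by chance
    for i in range(len(CATEGORIES)):
        for j in range(len(CATEGORIES)):
            w = abs(i - j) ** power
            observed_dis += w * observed[(i, j)] / n
            expected_dis += w * ref_marginal[i] * pred_marginal[j] / (n * n)
    if expected_dis == 0:
        return float("nan")  # undefined when both raters use a single category
    return 1.0 - observed_dis / expected_dis


def f1_per_group(reference, predicted):
    """One-vs-rest F1 score for each of the three risk groups."""
    ref_groups = [risk_group(r) for r in reference]
    pred_groups = [risk_group(p) for p in predicted]
    pairs = list(zip(ref_groups, pred_groups))
    scores = {}
    for group in ("low", "intermediate", "high"):
        tp = sum(r == group and p == group for r, p in pairs)
        fp = sum(r != group and p == group for r, p in pairs)
        fn = sum(r == group and p != group for r, p in pairs)
        denom = 2 * tp + fp + fn
        scores[group] = 2 * tp / denom if denom else 0.0
    return scores


# Toy ratings for illustration only; not data from the study.
radiologists = [1, 2, 3, 4, 5, 3]
model_output = [1, 2, 4, 4, 5, 3]
kappa = weighted_kappa(radiologists, model_output)
f1 = f1_per_group(radiologists, model_output)
```

In the toy data the single disagreement is a PI-RADS 3 report rated as 4. That costs little in kappa because the categories are adjacent, but it lowers the F1 score of the intermediate group noticeably, because that group contains so few reports.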
Similar works
Docetaxel plus Prednisone or Mitoxantrone plus Prednisone for Advanced Prostate Cancer
2004 · 5,693 citations
Decision Curve Analysis: A Novel Method for Evaluating Prediction Models
2006 · 5,090 citations
Increased Survival with Enzalutamide in Prostate Cancer after Chemotherapy
2012 · 4,535 citations
Biochemical Outcome After Radical Prostatectomy, External Beam Radiation Therapy, or Interstitial Radiation Therapy for Clinically Localized Prostate Cancer
1998 · 4,488 citations
Screening and Prostate-Cancer Mortality in a Randomized European Study
2009 · 3,990 citations