This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

OR24-04 Assessing the Performance of ChatGPT-4.0 in Handling Complex Clinical Queries Based on NCCN Guidelines for Neuroendocrine and Adrenal Tumors

2025 · 0 citations · Journal of the Endocrine Society · Open Access

Abstract

Abstract Disclosure: S. Pandya: None. T. Makaryan: None. T. Bresler: None. R. Meyer: None. Z. Htway: None. M. Fujita: None.

Background: Artificial intelligence (AI) is becoming increasingly prominent in healthcare, particularly in support of decision-making. Although ChatGPT (developed by OpenAI, San Francisco, CA) has demonstrated potential in clinical applications, its ability to navigate the intricate decision-making pathways specific to neuroendocrine and adrenal gland tumors remains unexplored.

Objective: This study aimed to evaluate the utility of ChatGPT-4o in providing guidance for clinical decision-making in cases involving neuroendocrine and adrenal gland tumors, using the National Comprehensive Cancer Network® (NCCN) guidelines as a reference.

Methods: The study involved a detailed examination of the NCCN guidelines related to neuroendocrine and adrenal gland tumors. Complex clinical scenarios were derived from the guidelines' decision-making algorithms, resulting in the formulation of 133 clinical questions. ChatGPT-4o was queried systematically by two independent physicians. The AI-generated responses were scored on a 5-point Likert scale: 5) Fully correct; 4) Mostly correct, with some missing information; 3) Partially correct but lacking completion; 2) Partially incorrect; 1) Completely incorrect. We analyzed the frequency of each score and performed subgroup assessments to compare both the overall Correctness of responses (Likert scores 3-5) and Accuracy (Likert scores 4-5). Additionally, the questions were grouped into several categories such as workup, treatment, surveillance, and diagnostics. A subgroup analysis was conducted using the Kruskal-Wallis test to assess the distribution of scores across these categories.

Results: Across 44 pages of the NCCN guidelines, 133 clinical questions were developed and assessed. ChatGPT-4o’s responses were deemed Correct (Likert scores of 3-5) in 97.7% of instances, while responses considered Accurate (Likert scores of 4-5) were observed in 96.2% of cases. The primary areas of weakness were in questions requiring multi-step decision-making, where certain nuances were overlooked or misinterpreted, indicating areas for potential improvement in the AI's clinical utility. The subgroup analysis using the Kruskal-Wallis test showed no statistically significant difference in scoring distribution among the four intervention types (H = 7.5, p = 0.057).

Conclusion: ChatGPT-4o shows considerable promise as a supportive tool in clinical decision-making for managing neuroendocrine and adrenal gland tumors. Its ability to provide accurate responses to complex clinical questions based on NCCN guidelines demonstrates its potential utility in enhancing clinician workflows. Further research and refinement are necessary to fully validate its role in assisting with guideline-based clinical decisions.

Presentation: Sunday, July 13, 2025
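The scoring and subgroup analysis described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names (`summarize`, `kruskal_wallis_h`, `chi2_sf_df3`) and the example scores are hypothetical, and the p-value uses the standard chi-square approximation with 3 degrees of freedom (four categories). Under that approximation, H = 7.5 gives a p-value of roughly 0.058, in line with the reported 0.057.

```python
import math
from collections import Counter


def summarize(scores):
    """Return (correct_rate, accurate_rate) using the abstract's thresholds."""
    n = len(scores)
    correct = sum(1 for s in scores if s >= 3) / n   # Likert 3-5
    accurate = sum(1 for s in scores if s >= 4) / n  # Likert 4-5
    return correct, accurate


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of score lists."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    counts = Counter(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    next_rank = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t

    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Likert data are heavily tied, so the tie correction matters.
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


def chi2_sf_df3(h):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    return math.erfc(math.sqrt(h / 2)) + math.sqrt(2 * h / math.pi) * math.exp(-h / 2)


# Hypothetical scores for illustration only; these are NOT the study's data.
example_groups = {
    "workup": [5, 5, 4, 5],
    "treatment": [5, 4, 3, 5],
    "surveillance": [5, 5, 5, 4],
    "diagnostics": [4, 5, 2, 5],
}
all_scores = [s for g in example_groups.values() for s in g]
correct_rate, accurate_rate = summarize(all_scores)
h_stat = kruskal_wallis_h(list(example_groups.values()))
print(f"Correct: {correct_rate:.1%}, Accurate: {accurate_rate:.1%}")
print(f"H = {h_stat:.2f}, p = {chi2_sf_df3(h_stat):.3f}")
```

The closed-form tail probability applies only to comparisons of four groups; for a different number of categories, a general chi-square routine (for example from a statistics library) would be needed.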

Institutions

Los Robles Hospital & Medical Center (US)

Topics

Radiomics and Machine Learning in Medical Imaging; Artificial Intelligence in Healthcare and Education
