This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
OR24-04 Assessing the Performance of ChatGPT-4.0 in Handling Complex Clinical Queries Based on NCCN Guidelines for Neuroendocrine and Adrenal Tumors
Citations: 0
Authors: 6
Year: 2025
Abstract
Disclosure: S. Pandya: None. T. Makaryan: None. T. Bresler: None. R. Meyer: None. Z. Htway: None. M. Fujita: None.

Background: Artificial intelligence (AI) is becoming increasingly prominent in healthcare, particularly in support of decision-making. Although ChatGPT (developed by OpenAI, San Francisco, CA) has demonstrated potential in clinical applications, its ability to navigate the intricate decision-making pathways specific to neuroendocrine and adrenal gland tumors remains unexplored.

Objective: This study aimed to evaluate the utility of ChatGPT-4o in guiding clinical decision-making in cases involving neuroendocrine and adrenal gland tumors, using the National Comprehensive Cancer Network® (NCCN) guidelines as a reference.

Methods: The NCCN guidelines on neuroendocrine and adrenal gland tumors were examined in detail. Complex clinical scenarios were derived from the guidelines' decision-making algorithms, yielding 133 clinical questions. ChatGPT-4o was queried systematically by two independent physicians. The AI-generated responses were scored on a 5-point Likert scale: 5) fully correct; 4) mostly correct, with some missing information; 3) partially correct but incomplete; 2) partially incorrect; 1) completely incorrect. We analyzed the frequency of each score and performed subgroup assessments of both overall Correctness (Likert scores 3-5) and Accuracy (Likert scores 4-5). Additionally, the questions were grouped into categories such as workup, treatment, surveillance, and diagnostics. A subgroup analysis used the Kruskal-Wallis test to assess the distribution of scores across these categories.

Results: Across 44 pages of the NCCN guidelines, 133 clinical questions were developed and assessed. ChatGPT-4o's responses were deemed Correct (Likert scores 3-5) in 97.7% of instances and Accurate (Likert scores 4-5) in 96.2% of instances. The primary weaknesses appeared in questions requiring multi-step decision-making, where certain nuances were overlooked or misinterpreted, indicating areas for potential improvement in the AI's clinical utility. The Kruskal-Wallis subgroup analysis showed no statistically significant difference in score distribution among the four intervention types (H = 7.5, p = 0.057).

Conclusion: ChatGPT-4o shows considerable promise as a supportive tool in clinical decision-making for managing neuroendocrine and adrenal gland tumors. Its ability to provide accurate responses to complex clinical questions based on NCCN guidelines demonstrates its potential utility in enhancing clinician workflows. Further research and refinement are necessary to fully validate its role in assisting with guideline-based clinical decisions.

Presentation: Sunday, July 13, 2025
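
The scoring summary described in the abstract (share of responses at or above a Likert threshold, plus a Kruskal-Wallis comparison across question categories) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' analysis code: the function names `proportion_at_least`, `chi2_sf`, and `kruskal_wallis` are hypothetical, the per-category scores are invented toy values rather than study data, and the p-value uses the standard chi-square approximation with tie correction.

```python
import math
from collections import Counter


def proportion_at_least(scores, threshold):
    """Share of Likert scores that are >= threshold (3 for Correct, 4 for Accurate)."""
    return sum(s >= threshold for s in scores) / len(scores)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: closed form built from erfc plus a finite sum of gamma terms.
    terms = (half ** (i + 0.5) / math.gamma(i + 1.5) for i in range((df - 1) // 2))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


def kruskal_wallis(groups):
    """Return (H, p) for the Kruskal-Wallis test with tie correction."""
    counts = Counter(v for g in groups for v in g)
    n_total = sum(counts.values())

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2.0
        start += t

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3.0 * (n_total + 1)

    # Likert data are heavily tied, so the tie correction matters here.
    tie_correction = 1.0 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
    if tie_correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    h /= tie_correction
    return h, chi2_sf(h, len(groups) - 1)


if __name__ == "__main__":
    # Hypothetical toy scores per category; NOT the study's data.
    toy = {
        "workup": [5, 5, 4, 5, 3],
        "treatment": [5, 4, 4, 2, 5],
        "surveillance": [5, 5, 5, 4, 5],
        "diagnostics": [4, 5, 3, 5, 5],
    }
    pooled = [s for scores in toy.values() for s in scores]
    print("Correct (3-5): ", proportion_at_least(pooled, 3))
    print("Accurate (4-5):", proportion_at_least(pooled, 4))
    print("Kruskal-Wallis:", kruskal_wallis(list(toy.values())))
    print("p for H = 7.5, df = 3:", chi2_sf(7.5, 3))
```

As a sanity check on the reported statistics, `chi2_sf(7.5, 3)` evaluates to roughly 0.058. That is in line with the reported p = 0.057 once rounding of H is taken into account, and it matches three degrees of freedom, that is, four categories.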
Similar works
New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1)
2008 · 28,945 citations
TNM Classification of Malignant Tumours
1987 · 16,123 citations
A survey on deep learning in medical image analysis
2017 · 13,632 citations
Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening
2011 · 10,780 citations
The American Joint Committee on Cancer: the 7th Edition of the AJCC Cancer Staging Manual and the Future of TNM
2010 · 9,111 citations