This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Computer Vision Meets Large Language Models: Performance of ChatGPT 4.0 on Dermatology Boards-Style Practice Questions
Citations: 5
Authors: 4
Year: 2024
Abstract
Background: ChatGPT is a generative artificial intelligence tool with numerous professional applications, and its uses in medical education are currently being explored. The performance of ChatGPT 4.0 on image-based dermatology boards-style practice questions has not been assessed.
Objective: The objective of this study was to determine the accuracy with which ChatGPT can answer dermatology board examination practice questions.
Methods: A total of 150 multiple-choice questions from the popular question bank DermQbank were inputted into ChatGPT. Of these, 83 were text-only questions and 67 had associated images. These same questions were inputted into ChatGPT again in July 2024. An additional 150 questions were inputted, for a total of 300 different questions, of which 169 were text-only and 133 had associated images.
Results: Across the aggregate 300-question data set, ChatGPT answered 232 questions correctly (77.3%). ChatGPT performed significantly better on text-only questions than on questions that included images (85.2% (144/169) vs. 67.7% (90/133), P<.001). Among image-based questions, ChatGPT performed better on clinical image questions than on dermatopathology questions (69.0% (80/116) vs. 58.8% (10/17), P=.40), but this difference was not statistically significant, partially due to the small sample size of the dermatopathology questions. Compared to post-graduate year 4 (PGY-4) residents, ChatGPT performed above the 46th percentile. ChatGPT agreed with the answer choice picked by the majority of question bank users 75.3% of the time. Multivariable regression demonstrated that significant predictive variables for ChatGPT answering a question correctly included the percentage of dermatology trainees who answered the question correctly and whether the question was text-based (P<.001 and P=.004, respectively).
Conclusions: ChatGPT answered 77.3% of dermatology board examination practice questions correctly, performing above the 46th percentile of PGY-4 question bank users. If using ChatGPT as a study resource for dermatology board examination preparation, residents should be judicious about exactly how they employ it, to avoid learning incorrect information.