This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Evaluating ChatGPT's Ability to Solve Higher-Order Questions on the Competency-Based Medical Education Curriculum in Medical Biochemistry
Citations: 63
Authors: 2
Year: 2023
Abstract
Background Healthcare-related artificial intelligence (AI) is a developing field. In AI, higher cognitive thinking refers to a system's capacity to carry out sophisticated cognitive processes such as problem-solving, decision-making, reasoning, and perception. This kind of thinking requires more than processing facts; it also entails comprehending and working with abstract ideas, evaluating and applying context-relevant data, and producing new insights from prior learning and experience. ChatGPT is an AI-based conversational program that uses natural language processing models to answer people's questions. The platform has attracted worldwide attention and is being applied to complex problems in many domains. Nevertheless, ChatGPT's capacity to respond correctly to queries requiring higher-level thinking in medical biochemistry has not yet been investigated. This research therefore aimed to evaluate ChatGPT's aptitude for answering higher-order questions in medical biochemistry.
Objective To determine whether ChatGPT can solve higher-order problems in medical biochemistry.
Methods This cross-sectional study was conducted online by conversing with the then-current version of ChatGPT (14 March 2023 version, free for registered users at the time of the study). It was presented with 200 medical biochemistry reasoning questions requiring higher-order thinking. The questions were randomly selected from the institution's question bank and classified according to the competency modules of the Competency-Based Medical Education (CBME) curriculum. The responses were collected and archived for subsequent analysis. Two expert biochemistry academicians scored the responses on a scale of zero to five. The scores were compared against hypothetical values using a one-sample Wilcoxon signed-rank test.
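The one-sample Wilcoxon signed-rank comparison named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `wilcoxon_one_sample`, the normal approximation with tie correction, and the ten example scores are assumptions made here for demonstration, and the study's data are not reproduced.

```python
import math
from statistics import NormalDist


def wilcoxon_one_sample(scores, hypothesized):
    """Two-sided one-sample Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, z, p_value). Scores equal to the hypothesized value are
    dropped, tied absolute differences receive average ranks, and the variance
    is tie-corrected. No continuity correction is applied.
    """
    diffs = [s - hypothesized for s in scores if s != hypothesized]
    n = len(diffs)
    if n == 0:
        raise ValueError("all scores equal the hypothesized value")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value


# Made-up scores on the 0-5 scale; NOT data from the study.
example_scores = [3.0, 3.5, 4.0, 4.5, 4.0, 3.5, 4.5, 4.0, 4.0, 3.5]
for target in (5.0, 4.0):
    w, z, p = wilcoxon_one_sample(example_scores, target)
    print(f"hypothesized={target}: W+={w:.1f}, z={z:.2f}, p={p:.4f}")
```

The normal approximation is rough for a sample this small. In practice a library routine such as `scipy.stats.wilcoxon`, applied to the scores minus the hypothesized value, would normally be used instead.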
Results The AI software answered the 200 questions requiring higher-order thinking with a median score of 4.0 (Q1=3.50, Q3=4.50). In a one-sample Wilcoxon signed-rank test, the scores were significantly lower than the hypothetical maximum of five (p=0.001) and comparable to four (p=0.16). Scores did not differ significantly across questions from different CBME modules in medical biochemistry (Kruskal-Wallis p=0.39). Inter-rater reliability between the two biochemistry faculty members' scores was excellent (ICC=0.926 (95% CI: 0.814-0.971); F=19; p=0.001).
Conclusion The results indicate that ChatGPT has the potential to be a useful tool for answering questions requiring higher-order thinking in medical biochemistry, with a median score of four out of five. However, continued training and development with data on recent advances are essential to improve its performance and make it suitable for the ever-growing field of academic medical use.
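The inter-rater reliability figure above is an intraclass correlation coefficient (ICC). The abstract does not say which ICC form was used, so the sketch below assumes a two-way model with consistency agreement and reports both the single-rater and the average-of-raters coefficients. The function name `icc_two_way_consistency` and the example ratings are hypothetical and are not taken from the study.

```python
def icc_two_way_consistency(ratings):
    """ICC for a two-way model, consistency definition (an assumed form).

    `ratings` is a list of rows, one per question, each row holding the
    scores given by every rater. Returns (icc_single, icc_average, f_stat),
    where f_stat is the between-question mean square divided by the
    residual mean square.
    """
    n = len(ratings)       # number of rated questions
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    icc_single = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    icc_average = (ms_rows - ms_err) / ms_rows
    return icc_single, icc_average, ms_rows / ms_err


# Made-up scores from two raters on the 0-5 scale; NOT data from the study.
example_ratings = [
    [4.0, 4.0],
    [3.0, 3.5],
    [5.0, 4.5],
    [2.0, 2.5],
    [4.0, 4.5],
    [3.5, 3.5],
]
single, average, f_stat = icc_two_way_consistency(example_ratings)
print(f"ICC(single)={single:.3f}, ICC(average)={average:.3f}, F={f_stat:.2f}")
```

Under this assumed form the average-rater ICC equals 1 - 1/F, which is one way to sanity-check a reported ICC against its F statistic. Other ICC definitions, such as absolute agreement, give different values for the same ratings.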