OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 27.03.2026, 13:26

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Title: Assessing Quality of Scenario-Based Multiple-Choice Questions in Physiology: Faculty-Generated vs. ChatGPT-Generated Questions among Phase I Medical Students

2025·11 Zitationen·International Journal of Artificial Intelligence in EducationOpen Access
Volltext beim Verlag öffnen

11

Zitationen

3

Autoren

2025

Jahr

Abstract

Abstract The integration of Artificial Intelligence (AI), particularly Chatbot Generative Pre-Trained Transformer (ChatGPT), in medical education has introduced new possibilities for generating various educational resources for assessments. However, ensuring the quality of ChatGPT-generated assessments poses challenges, with limited research in the literature addressing this issue. Recognizing this gap, our study aims to investigate the quality of ChatGPT-based assessment. In this study among first-year medical students, a crossover design was employed to compare scenario-based multiple-choice questions (SBMCQs) crafted by both faculty members and ChatGPT through item analysis to determine the quality of assessment. The study comprised three main phases: development, implementation, and evaluation of SBMCQs. During the development phase, both faculty members and ChatGPT generated 60 SBMCQs each, covering topics related to cardiovascular, respiratory, and endocrinology. These questions underwent assessment by independent reviewers, after which 80 SBMCQs were selected for the tests. Subsequently, in the implementation phase, one hundred and twenty students, divided into two batches, were assigned to receive either faculty-generated or ChatGPT-generated questions across four test sessions. The collected data underwent rigorous item analysis and thematic analysis to evaluate the effectiveness and quality of the questions generated by both parties. Only 9 of ChatGPT’s SBMCQs met ideal criteria MCQ on Difficulty Index, Discrimination Index and Distractor Effectiveness contrasting with 19 from faculty. Moreover, ChatGPT’s questions exhibited a higher rate of nonfunctional distractors (33.75% vs. faculty’s 13.75%). During focus group discussion, faculty highlighted importance of educators in reviewing, refining, and validating ChatGPT-generated SBMCQs to ensure their appropriateness within the educational context.

Ähnliche Arbeiten

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationClinical Reasoning and Diagnostic SkillsInnovations in Medical Education
Volltext beim Verlag öffnen