OpenAlex · Updated hourly · Last updated: 28 March 2026, 02:16

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Pilot Study on Generative Artificial Intelligence’s Reliability in Qualitative Research Quality Appraisal Using CASP and JBI Checklists

2025 · 0 citations · INQUIRY: The Journal of Health Care Organization, Provision, and Financing · Open Access

Citations: 0 · Authors: 4 · Year: 2025

Abstract

Generative artificial intelligence (genAI) tools are transforming workflows, and interest in their potential applications in qualitative research is growing. While the use of genAI to facilitate the systematic review process has been explored, its application to the quality appraisal of qualitative research is not yet well understood. This pilot study evaluates how closely ChatGPT's appraisals of qualitative research, made with popular appraisal tools, agree with human assessments. Two reviewers applied the Critical Appraisal Skills Programme (CASP) and Joanna Briggs Institute (JBI) checklists for qualitative research to studies identified through a previously published review (n = 21). Next, iteratively developed prompts and a copy of each study were uploaded to ChatGPT with instructions to appraise each article. Interrater reliability measures and crude agreement were calculated to estimate the level of agreement between human and genAI assessments. Interrater reliability between human reviewers and ChatGPT (GPT-5) ranged from no agreement to moderate agreement for CASP checklist items (kappa: <.00 to .46; crude agreement: 23.8% to 100%) and from no agreement to substantial agreement for JBI items (kappa: <.00 to .83; crude agreement: 4.8% to 95.2%). Agreement was highest for reporting-based elements such as study aims, ethics approval, and value of research (CASP), and participant voices and conclusions (JBI). Disagreement was greatest for interpretive and context-dependent items such as research design, researcher-participant relationships, and worldview-methodology congruity. Findings demonstrate that ChatGPT (GPT-5) can reliably identify objective components yet performs inconsistently on items that require nuance and contextual understanding, across both checklists. For now, genAI should be used for quality appraisal of qualitative research only alongside human assessments, and any such use must uphold principles of transparency and data privacy.
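
The two agreement statistics reported in the abstract can be illustrated with a short sketch. It assumes the kappa in question is Cohen's kappa for two raters, which the abstract does not state explicitly, and it uses made-up ratings for a single checklist item across 21 studies; the function names `crude_agreement` and `cohen_kappa` are hypothetical and none of the numbers come from the study.

```python
from collections import Counter


def crude_agreement(rater_a, rater_b):
    """Share of items on which both raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = crude_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same
    # category if each rates independently at their own base rates.
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # undefined: both raters used one identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings for one checklist item across 21 studies.
human = ["yes"] * 15 + ["no"] * 6
genai = ["yes"] * 13 + ["no"] * 2 + ["yes"] * 2 + ["no"] * 4

print(round(crude_agreement(human, genai), 2))  # 0.81
print(round(cohen_kappa(human, genai), 2))      # 0.53

# Skewed case: the human rates every study "yes", the model dissents once.
human_skewed = ["yes"] * 21
genai_skewed = ["yes"] * 20 + ["no"]
skewed_agreement = crude_agreement(human_skewed, genai_skewed)  # 20/21
skewed_kappa = cohen_kappa(human_skewed, genai_skewed)          # approximately 0
```

The skewed case shows why the abstract reports both figures: when nearly all ratings fall into one category, chance agreement is already very high, so crude agreement can be close to 100% while kappa stays near zero.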

Topics

Artificial Intelligence in Healthcare and Education · Meta-analysis and systematic reviews · Qualitative Research Methods and Applications