This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Evaluating the Potential of Large Language Models for Vestibular Rehabilitation Education: A Comparison of ChatGPT, Google Gemini, and Clinicians

2024 · 3 citations · medRxiv · Open Access

Abstract

Objective: We aimed to evaluate the performance of two publicly available large language models, ChatGPT and Google Gemini, on multiple-choice questions related to vestibular rehabilitation (VR).

Methods: The study was conducted among 30 physical therapists experienced with VR and 30 physical therapy students. They were asked to complete a Vestibular Knowledge Test (VKT) consisting of 20 multiple-choice questions divided into three categories: (1) Clinical Knowledge, (2) Basic Clinical Practice, and (3) Clinical Reasoning. ChatGPT and Google Gemini were tasked with answering the same 20 VKT questions. Three board-certified otoneurologists independently evaluated the accuracy of each response using a 4-level scale ranging from comprehensive to completely incorrect.

Results: ChatGPT outperformed Google Gemini on the VKT, scoring 70% against Gemini's 60%. Both achieved a perfect score of 100% in Clinical Knowledge but struggled in Clinical Reasoning, where ChatGPT scored 50% and Gemini scored 25%. According to the three otoneurologic experts, ChatGPT’s responses were comprehensive for 45% of the 20 questions, while 25% were completely incorrect. ChatGPT provided comprehensive responses to 50% of the Clinical Knowledge and Basic Clinical Practice questions, but to only 25% of the Clinical Reasoning questions.

Conclusion: Caution is advised when using ChatGPT and Google Gemini because of their limited accuracy in clinical reasoning. While they provide accurate responses concerning Clinical Knowledge, their reliance on web information may lead to inconsistencies. ChatGPT performed better than Gemini. Healthcare professionals should formulate questions carefully and be aware that the online prevalence of information may influence ChatGPT’s and Google Gemini’s responses. Combining clinical expertise and clinical guidelines with ChatGPT and Google Gemini can maximize benefits while mitigating limitations.

Impact Statement: This study highlights the potential utility of large language models like ChatGPT in supplementing clinical knowledge for physical therapists, while underscoring the need for caution in domains requiring complex clinical reasoning. The findings emphasize the importance of integrating technological tools carefully with human expertise to enhance patient care and rehabilitation outcomes.

Topics

Artificial Intelligence in Healthcare and Education; Clinical Reasoning and Diagnostic Skills; Acute Ischemic Stroke Management
