This is an overview page with metadata about this scientific work. The full article is available from the publisher.
Generative Artificial Intelligence Responses to Common Patient-Centric Hand and Wrist Surgery Questions: A Quality and Usability Analysis
Citations: 1
Authors: 4
Year: 2025
Abstract
<b>Background:</b> Due to the rapid evolution of generative artificial intelligence (AI) and its implications on patient education, there is a pressing need to evaluate AI responses to patients' medical questions. This study assessed the quality and usability of responses received from two prominent AI platforms to common patient-centric hand and wrist surgery questions. <b>Methods:</b> Twelve commonly encountered hand and wrist surgery patient questions were inputted twice into both Gemini and ChatGPT, generating 48 responses. Each response underwent a content analysis, followed by assessment for quality and usability with three scoring tools: DISCERN, Suitability Assessment of Materials (SAM) and the AI Response Metric (AIRM). Statistical analyses compared the features and scores of the outputs when stratified by platform, question type and response order. <b>Results:</b> Responses earned mean overall scores of 55.7 ('good'), 57.2% ('adequate') and 4.4 for DISCERN, SAM and AIRM, respectively. No responses provided citations. Wrist question responses had significantly higher DISCERN (<i>p</i> < 0.01) and AIRM (<i>p</i> = 0.02) scores compared to hand responses. Second responses had significantly higher AIRM (<i>p</i> < 0.01), but similar DISCERN (<i>p</i> = 0.76) and SAM (<i>p</i> = 0.11), scores compared to the first responses. Gemini's DISCERN (<i>p</i> = 0.04) and SAM (<i>p</i> < 0.01) scores were significantly higher than ChatGPT's corresponding metrics. <b>Conclusions:</b> Although responses are generally 'good' and 'adequate', there is variable quality with respect to platform used, type of question and response order. Given the diversity of publicly available AI platforms, it is important to understand the quality and usability of information patients may encounter during their search for answers to common hand and wrist surgery questions. <b>Level of Evidence:</b> Level IV (Therapeutic).
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,349 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,219 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,631 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,480 citations