This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Artificial Intelligence Can Answer Postoperative Questions About Distal Radius Fractures—But Can Patients Understand the Answers?
Citations: 1
Authors: 4
Year: 2025
Abstract
Purpose: The purpose of this study was to assess the validity, reliability, and readability of responses from ChatGPT, Microsoft Copilot, and Google Gemini to common postoperative patient questions.
Methods: Twenty-seven thoroughly vetted questions regarding distal radius fracture repair surgery were compiled and entered into ChatGPT 4, Gemini, and Copilot. The responses were analyzed for quality, accuracy, and readability using the DISCERN scale, the Journal of the American Medical Association benchmark criteria, the Flesch-Kincaid Reading Ease Score, and the Flesch-Kincaid Grade Level. Citations provided by Google Gemini and Microsoft Copilot were further categorized by source of reference. Five questions were resubmitted with a request to simplify the response, and the simplified responses were re-evaluated using the same metrics.
Results: All three artificial intelligence platforms produced answers that were considered "good" quality (DISCERN scores >50). Copilot had the highest quality of information (68.3), followed by Gemini (62.9) and ChatGPT (52.9). The information provided by Copilot demonstrated the highest reliability, with a Journal of the American Medical Association benchmark score of 3 (of 4), compared with 1 for Gemini and 0 for ChatGPT. All three platforms generated complex texts, with Flesch-Kincaid Reading Ease Scores between 35.8 and 41.4 and Flesch-Kincaid Grade Level scores between 10.5 and 12.1, which indicates that at least a high-school-graduate reading level is required. After simplification, Gemini's reading level remained unchanged, whereas ChatGPT improved to a seventh-grade reading level and Copilot improved to an eighth-grade reading level. Copilot provided more references (74) than Gemini (36).
Conclusions: All three platforms provided safe and reliable answers to postoperative questions about distal radius fractures. The high reading level of AI-generated answers remains the biggest barrier to patient accessibility.
Clinical relevance: In their current state, mainstream AI platforms are best suited as adjunct tools that support, rather than replace, clinical communication from health care workers.
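The two readability metrics named in the abstract are simple functions of word, sentence, and syllable counts. The sketch below applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is not the scoring tool used in the study, which the abstract does not name. The helper names and the vowel-group syllable counter are assumptions made for illustration, and the syllable heuristic is only approximate.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Hypothetical heuristic: count runs of vowels, with a minimum of one.

    This over- or under-counts some words (silent "e", adjacent vowels in
    separate syllables), so scores from it are rough estimates only.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (words, sentences, syllables) for a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "Keep your wrist raised. Move your fingers often."
    w, s, syl = text_counts(sample)
    print(round(flesch_reading_ease(w, s, syl), 1))
    print(round(flesch_kincaid_grade(w, s, syl), 1))
```

Both formulas depend only on average sentence length and average syllables per word, so shorter sentences and shorter words raise the Reading Ease score and lower the Grade Level. This is consistent with the abstract's observation that an explicit request for simplification lowered the grade level for two of the three platforms.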