This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Assessing the accuracy and reproducibility of artificial intelligence-generated medical responses by ChatGPT on Scheuermann’s kyphosis
Citations: 0
Authors: 6
Year: 2024
Abstract
Objectives: The study aimed to measure the performance and reproducibility of artificial intelligence in answering frequently asked questions about Scheuermann’s kyphosis and to compare the artificial intelligence with the SOSORT (International Scientific Society on Scoliosis Orthopaedic and Rehabilitation Treatment) consensus in answering case-based questions.
Materials and methods: In this cross-sectional study, 75 questions adapted from frequently asked questions about Scheuermann’s kyphosis were queried twice on ChatGPT. Response similarity was assessed to investigate reproducibility. The accuracy of responses was scored based on a scale. Four case studies from the end of the 7th SOSORT consensus paper on the conservative treatment of idiopathic and Scheuermann’s kyphosis were presented to ChatGPT.
Results: ChatGPT provided correct and comprehensive answers to 43 (57.33%) questions, correct but not comprehensive answers to 29 (38.67%) questions, and partially incorrect answers to 3 (4%) questions. ChatGPT performed best in the quality-of-life category, with 18/19 (94.73%) correct scores (score of 1). ChatGPT performed worst in the diagnosis category, with 3/8 (37.5%) correct and comprehensive answers, and in the treatment and follow-up category, with 9/24 (37.5%) correct and comprehensive answers. ChatGPT provided reproducible answers to 92% of the questions. ChatGPT's responses to the treatment of all four case studies were incorrect.
Conclusion: While ChatGPT can provide valuable general information regarding Scheuermann’s kyphosis, its ability to offer accurate treatment-related advice is limited.
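The percentages reported in the abstract can be recomputed from the stated counts. The following is a minimal sketch, not the authors' analysis code; the helper name `pct` and the dictionary labels are illustrative assumptions, and only the counts come from the abstract. Note that 18/19 rounds to 94.74%, so the reported 94.73% was presumably truncated rather than rounded.

```python
# Counts are taken from the abstract; the helper `pct` is a hypothetical name.
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

TOTAL_QUESTIONS = 75

overall = {
    "correct and comprehensive": 43,
    "correct but not comprehensive": 29,
    "partially incorrect": 3,
}

# The three rating groups should account for every question.
assert sum(overall.values()) == TOTAL_QUESTIONS

overall_pct = {label: pct(n, TOTAL_QUESTIONS) for label, n in overall.items()}

# Per-category shares as reported in the abstract.
by_category = {
    "quality of life": pct(18, 19),
    "diagnosis": pct(3, 8),
    "treatment and follow-up": pct(9, 24),
}

# 92% reproducibility over 75 questions corresponds to 69 questions.
reproducible_questions = round(0.92 * TOTAL_QUESTIONS)

for label, value in {**overall_pct, **by_category}.items():
    print(f"{label}: {value}%")
print(f"reproducible questions: {reproducible_questions}")
```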
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,773 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,682 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,242 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,898 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations