This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Capability of chatbots powered by large language models to support the screening process of scoping reviews: a feasibility study
0
Citations
4
Authors
2025
Year
Abstract
Objectives: The surge in publications increases the screening time required to maintain high-quality literature reviews. One of the most time-consuming phases is title and abstract screening. Machine learning tools have semi-automated this process for systematic reviews, with limited success for scoping reviews. ChatGPT, a chatbot based on a large language model, might support scoping review screening by identifying key concepts and themes. We hypothesize that ChatGPT outperforms the semi-automated tool Rayyan, increasing efficiency at acceptable costs while maintaining a low type II error. Materials and Methods: We conducted a retrospective study using human screening decisions on a scoping review of 15 307 abstracts as a benchmark. A training set of 100 abstracts was used for prompt engineering for ChatGPT and for training Rayyan. Screening decisions for all abstracts were obtained via an application programming interface for ChatGPT and manually for Rayyan. We calculated performance metrics, including accuracy, sensitivity, and specificity, with Stata. Results: ChatGPT 4.0 returned decisions for 15 306 abstracts, vastly outperforming Rayyan. ChatGPT 4.0 demonstrated an accuracy of 68%, specificity of 67%, sensitivity of 88%-89%, a negative predictive value of 99%, and an 11% false negative rate when compared to human researchers' decisions. The workload savings were 64% at reasonable costs. Discussion and Conclusion: This study demonstrated ChatGPT's potential to be applied in the first phase of the literature appraisal process for scoping reviews. However, human oversight remains paramount. Additional research on ChatGPT's parameters, the prompts, and screening scenarios is necessary in order to validate these results and to develop a standardized approach.
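The performance metrics named in the abstract all derive from a 2x2 confusion matrix that compares the tool's include/exclude decisions with the human benchmark. The sketch below shows the standard definitions in Python; it is not the authors' Stata analysis. The class name, function name, and example counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of tool decisions against the human benchmark (hypothetical helper)."""
    tp: int  # tool includes, humans include
    fp: int  # tool includes, humans exclude
    tn: int  # tool excludes, humans exclude
    fn: int  # tool excludes, humans include (a type II error)


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard screening metrics for one confusion matrix."""
    total = c.tp + c.fp + c.tn + c.fn
    positives = c.tp + c.fn       # abstracts the humans included
    negatives = c.tn + c.fp       # abstracts the humans excluded
    tool_excluded = c.tn + c.fn   # abstracts the tool excluded
    if 0 in (total, positives, negatives, tool_excluded):
        raise ValueError("every denominator must be non-zero")
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / positives,
        "specificity": c.tn / negatives,
        "negative_predictive_value": c.tn / tool_excluded,
        "false_negative_rate": c.fn / positives,  # equals 1 - sensitivity
    }


# Made-up counts for illustration only; NOT the study's data.
example = ConfusionCounts(tp=90, fp=300, tn=600, fn=10)
metrics = screening_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and the false negative rate share a denominator and sum to 1, which matches the abstract's pairing of 88%-89% sensitivity with an 11% false negative rate. A high negative predictive value alongside moderate specificity is typical when relevant abstracts are a small fraction of the total.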
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,693 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,598 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,124 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,871 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations