This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Assessing the methodologic quality of systematic reviews using generative large language models
Citations: 0
Authors: 6
Year: 2025
Abstract
INTRODUCTION: We aimed to evaluate whether generative large language models (LLMs) can accurately assess the methodologic quality of systematic reviews (SRs).
METHODS: A total of 114 SRs from five leading urology journals were included in the study. Human reviewers graded each SR in duplicate, with disagreements adjudicated by a third expert. We created a customized generative artificial intelligence (generative pre-trained transformer [GPT]), "Urology AMSTAR 2 Quality Assessor," and used it to grade the 114 SRs in three iterations with a zero-shot method. We then performed an enhanced trial focusing on critical criteria, giving GPT detailed, step-by-step instructions for each SR using a chain-of-thought method. Accuracy, sensitivity, specificity, and F1 score for each GPT trial were calculated against the human results. Internal validity among the three trials was also computed.
RESULTS: GPT had an overall congruence of 75% with the human results, with 77% in critical criteria and 73% in non-critical criteria. The average F1 score was 0.66. Internal validity across the three iterations was high at 85%. GPT assigned 89% of studies to the correct overall quality category. When given specific, step-by-step instructions, congruence on critical criteria improved to 91%, and overall quality assessment accuracy to 93%.
CONCLUSIONS: GPT showed promising ability to efficiently and accurately assess the quality of SRs in urology.
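The agreement metrics named in the abstract (accuracy, sensitivity, specificity, F1) can be sketched for a single binary AMSTAR 2 criterion, treating the adjudicated human rating as ground truth. This is a hypothetical illustration with made-up ratings, not the authors' actual data or code:

```python
# Hypothetical sketch: agreement metrics for one binary AMSTAR 2 criterion.
# Human adjudicated ratings serve as ground truth; True = criterion met.
# Ratings below are invented for illustration only.

def agreement_metrics(human, gpt):
    """Return (accuracy, sensitivity, specificity, F1) of GPT vs. human ratings."""
    tp = sum(h and g for h, g in zip(human, gpt))          # both rate "met"
    tn = sum(not h and not g for h, g in zip(human, gpt))  # both rate "not met"
    fp = sum(not h and g for h, g in zip(human, gpt))      # GPT over-credits
    fn = sum(h and not g for h, g in zip(human, gpt))      # GPT under-credits
    accuracy = (tp + tn) / len(human)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return accuracy, sensitivity, specificity, f1

# Toy example: invented ratings for 8 systematic reviews.
human = [True, True, True, False, False, True, False, True]
gpt   = [True, True, False, False, True, True, False, True]
acc, sens, spec, f1 = agreement_metrics(human, gpt)
```

In the study these per-criterion comparisons would be aggregated across all 114 SRs and all AMSTAR 2 items to yield the reported congruence and average F1 figures.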
Related works
The PRISMA 2020 statement: an updated guideline for reporting systematic reviews
2021 · 90,873 citations
Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement
2009 · 83,099 citations
The Measurement of Observer Agreement for Categorical Data
1977 · 78,079 citations
Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement
2009 · 63,605 citations
Measuring inconsistency in meta-analyses
2003 · 62,251 citations