Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Can generative AI reliably synthesise literature? exploring hallucination issues in ChatGPT
19
Zitationen
2
Autoren
2025
Jahr
Abstract
Abstract This study evaluates the capabilities and limitations of generative AI, specifically ChatGPT, in conducting systematic literature reviews. Using the PRISMA methodology, we analysed 124 recent studies, focusing in-depth on a subset of 40 selected through strict inclusion criteria. Findings show that ChatGPT can enhance efficiency, with reported workload reductions averaging around 60–65%, though accuracy varies widely by task and context. In structured domains such as clinical research, title and abstract screening sensitivity ranged from 80.6% to 96.2%, while precision dropped as low as 4.6% in more interpretive tasks. Hallucination rates reached 91%, underscoring the need for careful oversight. Comparative analysis shows that AI matches or exceeds human performance in simple screening but underperforms in nuanced synthesis. To support more reliable integration, we introduce the Systematic Research Processing Framework (SRPF) as a guiding model for hybrid AI–human collaboration in research review workflows.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.316 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.177 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.575 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.776 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.468 Zit.