This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
A retrieval-augmented generation large language model framework for accurate dementia identification from electronic health records
Citations: 0
Authors: 8
Year: 2026
Abstract
Objective: Accurate and scalable dementia phenotyping from electronic health records (EHRs) is foundational for population-level research, risk prediction, and learning health system interventions. Traditional rule- and keyword-based approaches are limited by inconsistent documentation and by their inability to capture clinical nuance. We aim to develop and evaluate a framework that leverages large language models (LLMs) with retrieval-augmented generation (RAG) to overcome these limitations and improve dementia identification from real-world EHR data.
Methods: Using EHR data from the Mass General Brigham health system, we first assembled a cohort of adults with potential dementia based on diagnosis codes, problem lists, dementia-related medications, and free-text note mentions. A subset of candidate cases underwent detailed manual chart review to assign gold-standard dementia status. With this labeled sample, we implemented and compared three approaches for dementia ascertainment: (1) a rule-based classifier leveraging structured EHR data, (2) LLMs applied to keyword-filtered clinical note excerpts, and (3) a RAG-based LLM framework that integrates retrieved, context-rich note snippets. Within each approach, we evaluated multiple configurations of embedding models, retrieval methods, LLMs, structured-data inclusion, and prompts to identify the best-performing classifier. Performance was assessed using standard classification metrics, including sensitivity, specificity, positive predictive value (PPV), and F1 score, and supplemented by qualitative error analyses to characterize common sources of false positives and false negatives across methods.
Results: The RAG-based classifier achieved the highest performance (F1=0.933, sensitivity=91.1%, PPV=95.5%), ahead of the rule-based classifier (F1=0.823, sensitivity=81.1%, PPV=83.5%) and the keyword-filtered LLM (F1=0.903, sensitivity=91.7%, PPV=88.6%). Including ICD codes alongside free text in the RAG-based LLM pipeline significantly reduced PPV and modestly decreased the F1 score. Error analysis revealed that structured-code dependence contributed to false positives, whereas unrecognized contextual cues in notes drove false negatives.
Conclusion: A RAG-based LLM pipeline without structured ICD codes improved dementia ascertainment from EHR data compared with ICD-based rules and keyword-based filtering. This approach can enhance dementia case identification and support patient care, predictive modeling, and risk analysis.
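The metrics named in the abstract follow from the standard confusion-matrix definitions. The sketch below is illustrative only and is not the authors' evaluation code: the function names `classification_metrics` and `f1_from` are hypothetical, and the confusion-matrix counts in the usage example are invented for demonstration. The PPV, sensitivity, and F1 values in `reported` are the only figures taken from the abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: share of true cases that were found
    specificity = tn / (tn + fp)   # share of non-cases correctly rejected
    ppv = tp / (tp + fp)           # precision: share of flagged cases that are true
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


def f1_from(ppv, sensitivity):
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) as stated in the abstract.
reported = {
    "RAG-based LLM": (0.955, 0.911, 0.933),
    "Rule-based": (0.835, 0.811, 0.823),
    "Keyword-filtered LLM": (0.886, 0.917, 0.903),
}

if __name__ == "__main__":
    # Hypothetical counts, NOT from the paper: 90 TP, 10 FP, 10 FN, 90 TN.
    print(classification_metrics(tp=90, fp=10, fn=10, tn=90))

    # Recompute F1 from the rounded PPV and sensitivity in the abstract.
    for name, (ppv, sens, f1) in reported.items():
        print(f"{name}: reported F1={f1:.3f}, recomputed F1={f1_from(ppv, sens):.3f}")
```

Recomputing F1 from rounded PPV and sensitivity can differ from a reported F1 in the third decimal place. Such small gaps may come from rounding or from how the scores were aggregated, which the abstract does not specify.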