OpenAlex · Updated hourly · Last updated: 27 March 2026, 07:56

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Abstract 4367840: Optimizing the Accuracy of Natural Language Processing Models for Pulmonary Embolism Detection Through Integration with Claims Data: The PE-EHR+ Study

2025 · 0 citations · Circulation

Citations: 0 · Authors: 18 · Year: 2025

Abstract

Background: Rule-based natural language processing (NLP) tools are easy to implement and can identify pulmonary embolism (PE) via radiology reports, but their accuracy is limited when used in isolation, and their external validity remains uncertain.
Methods: In this cross-sectional study, we analyzed data from a prespecified sample of 1,712 hospitalized patients (with and without PE) at Mass General Brigham (MGB) hospitals (2016–2021) and applied two previously published NLP algorithms (Verma et al. and Johnson et al.) to radiology reports to identify PE. Chart review by two independent physicians using prespecified criteria was the reference standard. We tested three approaches: (A) NLP applied to all patients; (B) NLP limited to patients with primary or secondary International Classification of Diseases (ICD)-10 PE discharge codes; and (C) NLP applied to patients with PE discharge codes or a Present-on-Admission (POA) indicator (“Y” or “N”) for PE. All others were assumed PE-negative in Approaches B and C to minimize false positives with NLP. Weighted estimates were derived from the full MGB hospitalized cohort (n=381,642) to calculate F1 scores that summarize model performance by combining sensitivity and positive predictive value (PPV) [F1 = 2 × (PPV × sensitivity) / (PPV + sensitivity)].
Results: In total, 7,708 (2.0%) patients had PE. In Approach A, both NLP models showed high sensitivity (82.5%, 93.0%) and specificity (98.9%, 98.7%) but low PPV (60.3%, 59.6%) (Figure). Approach B improved PPV (95.2%, 94.9%) at the cost of reduced sensitivity (74.1%, 76.2%), while Approach C preserved both high sensitivity (82.5%, 93.0%) and PPV (95.6%, 95.8%). Approach C demonstrated the best performance, yielding significantly higher F1 scores for both NLP models (88.6%, 94.4%) compared with Approach A (69.7%, 72.6%) and Approach B (83.3%, 84.5%) (P<0.001).
Conclusions: The accuracy of PE detection improves when rule-based NLP models are operationalized within a screening framework using administrative claims data in addition to radiology reports.
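
The screening logic and the F1 formula in the abstract can be illustrated with a minimal Python sketch. It is not the study's code: the `Patient` fields, the `predict_pe` function, and the `REPORTED` table are hypothetical names introduced here, and the two values per approach are simply listed in the order the abstract reports them. The sketch gates an NLP result on claims data for Approaches A, B, and C, then recomputes F1 from the rounded PPV and sensitivity figures in the Results.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All three fields are hypothetical flags assumed to be precomputed upstream.
    nlp_positive: bool   # rule-based NLP flagged PE in a radiology report
    has_pe_icd10: bool   # primary or secondary ICD-10 PE discharge code
    has_pe_poa: bool     # POA indicator ("Y" or "N") recorded for PE


def predict_pe(p: Patient, approach: str) -> bool:
    """Predicted PE status under approach 'A', 'B', or 'C'."""
    if approach == "A":
        eligible = True                           # NLP applied to all patients
    elif approach == "B":
        eligible = p.has_pe_icd10                 # NLP only if a PE discharge code exists
    elif approach == "C":
        eligible = p.has_pe_icd10 or p.has_pe_poa # discharge code or POA indicator
    else:
        raise ValueError(f"unknown approach: {approach!r}")
    # Patients outside the eligible group are assumed PE-negative.
    return eligible and p.nlp_positive


def f1_score(ppv: float, sensitivity: float) -> float:
    """F1 = 2 x (PPV x sensitivity) / (PPV + sensitivity)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity) pairs as reported in the abstract, one pair per NLP model.
REPORTED = {
    "A": [(0.603, 0.825), (0.596, 0.930)],
    "B": [(0.952, 0.741), (0.949, 0.762)],
    "C": [(0.956, 0.825), (0.958, 0.930)],
}

f1_by_approach = {
    approach: [round(100 * f1_score(ppv, sens), 1) for ppv, sens in pairs]
    for approach, pairs in REPORTED.items()
}
print(f1_by_approach)
# {'A': [69.7, 72.6], 'B': [83.3, 84.5], 'C': [88.6, 94.4]}
```

The recomputed values match the F1 scores stated in the Results, which suggests the reported PPV, sensitivity, and F1 figures are internally consistent.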

Similar works