This is an overview page with metadata for this scientific work. The full article is available from the publisher.
1045: USING LARGE LANGUAGE MODELS TO EXPLAIN ALERTS FROM A PEDIATRIC RISK STRATIFICATION MODEL
Citations: 0
Authors: 12
Year: 2026
Abstract
Introduction: Hospitalized children who experience critical events, such as intubation or the administration of vasoactive drugs, are at increased risk for mortality and morbidity. We recently developed pCREST, a machine learning model that continuously predicts the risk of pediatric critical events across ED, ward, and ICU settings (Strutz et al., JAMA Network Open, 2025). Here, we develop new Large Language Model (LLM)-based algorithms that use a patient’s EHR data to generate text-based explanations for pCREST alerts.

Methods: We conducted a retrospective analysis of pediatric admissions to the University of Wisconsin-Madison (2009-2020). A pCREST score cutoff at the 90th percentile was used to identify children at risk for critical events within the next 12 hours. We developed two Mixtral 8x7B-Instruct LLM pipelines to generate explanations for at-risk alerts. The first (LLM-1 + Transformer) summarized unstructured notes documented in the 6 hours preceding a pCREST score, and a transformer model with label-aware attention was trained to identify key phrases important to pCREST alert generation. The second (LLM-2) was prompt-engineered to generate text summaries from vitals and labs recorded in the 6 hours preceding a pCREST score, with explicit instructions to identify signs of clinical deterioration. An o3-mini model was used as a judge to evaluate LLM-1 summaries against the source notes using the validated PDSQI-9 tool on a 5-point Likert scale.

Results: Among 40,498 admissions, 7,436 (18.36%) had at least one at-risk pCREST alert during their stay. LLM-1 outputs scored highly on the PDSQI-9, with median (IQR) scores ranging from 4 (4, 5) to 5 (4, 5) across the attributes of accuracy, thoroughness, usefulness, organization, and comprehensibility. The transformer model identified clinically relevant terms, such as “aortic,” “lethargy,” and “tachypnea,” as important for pCREST alerts. For a 2-year-old ICU sample patient, LLM-2 indicated a “concerning spike in respiratory rate (37.0 to 42.0),” which matched a Shapley-based analysis identifying the respiratory rate of 42 as important for the pCREST alert.

Conclusions: Our study lays the foundation for LLM-based pipelines that generate explainable summaries for hospitalized children identified as high-risk, potentially improving clinical decision-making.
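The two statistics that anchor the Methods and Results — flagging admissions whose pCREST score reaches the 90th-percentile cutoff, and summarizing PDSQI-9 Likert ratings as median (IQR) — can be sketched as follows. This is a minimal illustration only; the function names (`percentile_cutoff`, `flag_at_risk`, `median_iqr`) are hypothetical and not taken from the study, which does not publish its implementation.

```python
from statistics import quantiles, median

def percentile_cutoff(scores, pct=90):
    """Return the pct-th percentile of a pooled list of pCREST scores."""
    # quantiles(n=100) yields the 1st..99th percentile cut points
    return quantiles(scores, n=100)[pct - 1]

def flag_at_risk(admission_scores, cutoff):
    """An admission is 'at risk' if any of its pCREST scores reaches the cutoff."""
    return any(s >= cutoff for s in admission_scores)

def median_iqr(likert_scores):
    """Median and interquartile range, as reported for each PDSQI-9 attribute."""
    q1, _, q3 = quantiles(likert_scores, n=4)
    return median(likert_scores), (q1, q3)
```

With 5-point Likert ratings, `median_iqr([4, 4, 4, 4, 4, 5, 5, 5])` yields a median of 4 with an IQR of (4, 5), matching the format of the reported scores.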
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,391 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,257 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,685 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,501 citations