This is an overview page with metadata for this scientific work. The full article is available from the publisher.

A self-supervised framework for laboratory data imputation in electronic health records

2025 · 1 citation · Communications Medicine · Open Access

Abstract

Laboratory data in electronic health records (EHRs) is an effective source of information to characterize patient populations, inform accurate diagnostics and treatment decisions, and fuel research studies. However, despite their value, laboratory values are underutilized due to high levels of missingness. Existing imputation methods fall short, as they do not fully leverage patient clinical histories and are commonly not scalable to the large number of tests available in real-world data (RWD). To address these shortcomings, we present Laboratory Imputation Framework for EHRs (LIFE), a self-supervised learning framework based on multi-head attention that is trained to impute any laboratory test value at any point in time in the patient’s journey using their complete EHRs. This architecture (1) eliminates the need to train a different model for each laboratory test by jointly modeling all laboratory data of interest; and (2) better clinically contextualizes the predictions by leveraging additional EHR variables, such as diagnosis, medications, and discrete laboratory results. We validate our framework using a large-scale, real-world dataset encompassing over 1 million oncology patients. Our results demonstrate that LIFE obtains superior or equivalent results compared to state-of-the-art baseline methods in 23 out of 25 evaluated laboratory tests and better enhances a downstream adverse event detection task in 7 out of 9 cases. LIFE shows promise in accurately estimating missing laboratory values and enhancing the utilization of large-scale RWD in healthcare. This advancement could lead to better clinical models, more informed decision-making and improved patient outcomes.

Electronic health records (EHRs) contain laboratory test results that are crucial for modeling and analyzing patient health and outcomes. However, many laboratory test results are often missing, which limits their usefulness. Current automated methods to fill these gaps are not very effective because they generally do not utilize all available patient clinical information and cannot handle a wide variety of tests. To address this issue, we present Laboratory Imputation Framework for EHRs (LIFE), a model that predicts missing laboratory results at any point in time by analyzing a patient’s entire health record, including diagnoses and medications. Tested on data from over a million cancer patients, LIFE outperformed other methods in predicting laboratory results and improved the detection of several clinical adverse events. This tool could lead to better clinical models, potentially enhancing healthcare decisions and improving patient outcomes.

Heilbroner et al. present LIFE, a self-supervised learning framework for imputing laboratory test values at any point in a patient’s journey using electronic health records. LIFE generally outperforms state-of-the-art baselines and enhances adverse event detection when tested on over 1 million oncology records.
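The self-supervised idea in the abstract can be illustrated with a minimal sketch: hide some laboratory values that were actually observed, ask an imputer to fill them in, and score the predictions against the hidden truth. This is not the authors' implementation. LIFE is described as using multi-head attention over the complete record, whereas the sketch below swaps in a plain mean imputer so that it runs with the Python standard library alone. All function names, the masking fraction, and the example values are hypothetical.

```python
import random
from statistics import mean


def mask_observed(series, mask_fraction, rng):
    """Hide a random subset of observed (non-None) values.

    Returns the masked copy and a dict {index: true_value} of held-out targets.
    Assumes at least two observed values and mask_fraction < 1, so that some
    values remain visible to the imputer.
    """
    observed = [i for i, v in enumerate(series) if v is not None]
    n_hide = max(1, int(len(observed) * mask_fraction))
    hidden = rng.sample(observed, n_hide)
    masked = list(series)
    targets = {}
    for i in hidden:
        targets[i] = masked[i]
        masked[i] = None
    return masked, targets


def impute_mean(masked):
    """Placeholder imputer (an assumption, not LIFE): fill gaps with the visible mean."""
    visible = [v for v in masked if v is not None]
    fill = mean(visible)
    return [fill if v is None else v for v in masked]


def masked_mae(imputed, targets):
    """Mean absolute error computed on the held-out positions only."""
    return mean(abs(imputed[i] - true) for i, true in targets.items())


# Hypothetical series of one laboratory test; None marks a missing result.
series = [13.1, None, 12.8, 12.5, None, 11.9, 12.2, None, 12.0]
rng = random.Random(0)
masked, targets = mask_observed(series, mask_fraction=0.3, rng=rng)
imputed = impute_mean(masked)
error = masked_mae(imputed, targets)
print(f"held-out positions: {sorted(targets)}, masked MAE: {error:.3f}")
```

Because the targets come from the data itself, no manual labels are needed. Under the same assumptions, a learned model could replace `impute_mean` and be trained to minimize `masked_mae`.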

Institutions

Tempus Labs (United States)

Topics

Machine Learning in Healthcare; Electronic Health Records Systems; Artificial Intelligence in Healthcare and Education
