This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions
Citations: 22
Authors: 8
Year: 2024
Abstract
OBJECTIVE: To develop and validate a natural language processing (NLP) pipeline that detects 18 conditions in French clinical notes, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-enhancing workflow. MATERIALS AND METHODS: The detection pipeline relied on rule-based and machine learning algorithms for named entity recognition and entity qualification, respectively. We used a large language model pre-trained on millions of clinical notes along with annotated clinical notes in the context of 3 cohort studies related to oncology, cardiology, and rheumatology. The overall workflow was conceived to foster collaboration between studies while respecting the privacy constraints of the data warehouse. We estimated the added value of the advanced technologies and of the collaborative setting. RESULTS: The pipeline reached macro-averaged F1-score, positive predictive value, sensitivity, and specificity of 95.7 (95% CI 94.5-96.3), 95.4 (95% CI 94.0-96.3), 96.0 (95% CI 94.0-96.7), and 99.2 (95% CI 99.0-99.4), respectively. F1-scores were superior to those observed using alternative technologies or non-collaborative settings. The models were shared through a secured registry. CONCLUSIONS: We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate, and use sensitive artificial intelligence models. In particular, we provided an efficient and robust NLP pipeline that detects conditions mentioned in clinical notes.
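The macro-averaged metrics reported in the abstract can be illustrated with a minimal sketch. It assumes that each metric is computed per condition from confusion counts and then averaged with equal weight across conditions; the abstract does not state the exact averaging convention, and the `Counts` class, the function names, and the toy numbers below are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one condition (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def per_condition_metrics(c: Counts) -> dict:
    """Compute F1, PPV, sensitivity, and specificity for one condition."""
    ppv = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    sens = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    spec = c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if (ppv + sens) else 0.0
    return {"f1": f1, "ppv": ppv, "sensitivity": sens, "specificity": spec}


def macro_average(counts_by_condition: dict) -> dict:
    """Average each metric over conditions, giving every condition equal weight."""
    per = [per_condition_metrics(c) for c in counts_by_condition.values()]
    return {name: sum(m[name] for m in per) / len(per) for name in per[0]}


# Toy counts for two conditions (illustrative only, not study data).
toy = {
    "diabetes": Counts(tp=8, fp=2, fn=2, tn=88),
    "dementia": Counts(tp=5, fp=0, fn=5, tn=90),
}
macro = macro_average(toy)
print({name: round(value, 3) for name, value in macro.items()})
```

Because every condition contributes equally, a rare condition with weak detection lowers the macro average as much as a frequent one would, which is one reason this kind of averaging is often preferred when conditions differ widely in prevalence.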
Similar works
"Why Should I Trust You?"
2016 · 14,785 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,554 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,982 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,591 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,114 citations
Authors
Institutions
- Assistance Publique – Hôpitaux de Paris (FR)
- Inserm (FR)
- Sorbonne Université (FR)
- Institut Pierre Louis d'Épidémiologie et de Santé Publique (FR)
- Bicêtre Hospital (FR)
- Université Paris Cité (FR)
- Université Sorbonne Paris Nord (FR)
- Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé
- Hôpital Albert-Chenevier (FR)