OpenAlex · Updated hourly · Last updated: 06.04.2026, 10:06

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Understanding Surgical Complications in Clinical Text: Automated Clavien–Dindo Grading using a Zero-Shot Large Language Model Approach in a Collective of Liver Surgery Patients (Preprint)

2026 · 0 citations · Open Access

Citations: 0 · Authors: 14 · Year: 2026

Abstract

BACKGROUND: The standardized extraction of postoperative complications from unstructured routine clinical documentation remains a major unresolved challenge in digital surgery and health informatics. Although the Clavien–Dindo classification is the established standard for grading postoperative complications, its application in routine clinical documentation is largely implicit and unstructured.

OBJECTIVE: To assess the capability of open-weight and proprietary large language models (LLMs) to classify postoperative complications according to the Clavien–Dindo system using discharge letters, benchmarked against expert consensus annotations.

METHODS: We analyzed discharge letters from 650 surgical cases of patients who underwent hepatobiliary surgery between 2010 and 2024. The cohort included Grade I–II complications in 24%, Grade III–IV in 19%, and Grade V (death) in 6% of patients. Representative open-weight (Qwen 3, Llama 3.3, Ministral 3, GPT-OSS) and proprietary (GPT 5.1, Gemini 3 Pro) LLMs were prompted to infer complication grades directly from the discharge letters. Model performance was evaluated against expert assessment using accuracy and a detailed deviation analysis.

RESULTS: All models were capable of identifying and classifying complications from the unstructured documentation in the discharge letters. On the full 650-case dataset, open-weight models achieved accuracies of up to 0.775 for fine-grading prediction and 0.94 for binary classification. On a balanced 50-case subset, proprietary models achieved the highest performance, with accuracies of 0.78 for fine-grading and 0.98 for binary classification. In contrast, open-weight models reached accuracies of up to 0.70 for fine-grading and 0.94 for binary classification, with lower computational requirements. An ensemble approach yielded additional gains in classification performance.

CONCLUSIONS: LLMs can accurately classify postoperative complications from discharge letters, enabling scalable and objective monitoring of surgical outcomes. Their use may reduce manual abstraction workload and promote consistent, data-driven quality assessment in surgical care.
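The evaluation described in the abstract (fine-grading accuracy, binary accuracy, deviation analysis, and an ensemble) can be illustrated with a minimal Python sketch. The abstract does not specify how grades were encoded, how the binary task was defined, or how the ensemble was built, so everything below is an assumption made for illustration: grades are encoded as integers 0 to 5 (0 = no complication, 1 to 5 = Grade I to V, ignoring sub-grades), the binary task is taken to be "any complication vs. none", and the ensemble is a simple per-case majority vote. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(pred, gold):
    """Fraction of cases where the predicted label equals the expert label."""
    if len(pred) != len(gold) or not gold:
        raise ValueError("pred and gold must be non-empty and of equal length")
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def to_binary(grades):
    """Assumed binarization: any complication (grade >= 1) vs. none (grade 0)."""
    return [g >= 1 for g in grades]


def deviation_counts(pred, gold):
    """Histogram of signed deviations (predicted grade minus expert grade)."""
    return Counter(p - g for p, g in zip(pred, gold))


def majority_vote(per_model_preds):
    """Per-case majority vote across models.

    Ties are broken toward the lower grade; this is an arbitrary choice
    for the sketch, not a rule stated in the abstract.
    """
    ensemble = []
    for votes in zip(*per_model_preds):
        counts = Counter(votes)
        top = max(counts.values())
        ensemble.append(min(g for g, c in counts.items() if c == top))
    return ensemble


# Toy data (invented): expert grades and three hypothetical model outputs.
gold = [0, 2, 3, 5, 0, 1]
model_a = [0, 2, 2, 5, 1, 1]
model_b = [0, 1, 3, 5, 0, 1]
model_c = [0, 2, 3, 4, 0, 0]

fine_acc_a = accuracy(model_a, gold)
binary_acc_a = accuracy(to_binary(model_a), to_binary(gold))
deviations_a = deviation_counts(model_a, gold)
ensemble = majority_vote([model_a, model_b, model_c])
ensemble_acc = accuracy(ensemble, gold)
```

In this toy example each model makes errors on different cases, so the majority vote recovers the expert grade where a single model misses it. Whether the reported ensemble gains arise in the same way is not stated in the abstract.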

Similar works