OpenAlex · Updated hourly · Last updated: 2026-05-06, 04:24

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Recursive Learning Architecture for Zero-Shot Automated Clinical Coding, a methodological study (Preprint)

2026 · 0 citations · Open Access

Citations: 0 · Authors: 5 · Year: 2026

Abstract

BACKGROUND

Automated clinical coding with large language models has shown promise, but most approaches depend on supervised fine-tuning, static label spaces, or opaque prediction mechanisms that are difficult to audit and update. These limitations are particularly relevant in ICD-10-CM coding, where models must navigate complex documentation patterns, ambiguity, and evolving coding rules. Recursive learning architectures may offer an alternative by enabling systems to improve through explicit natural-language memory rather than parameter updates.

OBJECTIVE

This study evaluated whether a recursive learning architecture with an externalized Learning File could improve zero-shot ICD-10-CM coding performance on discharge summaries, while preserving interpretability and enabling analysis of longitudinal learning dynamics.

METHODS

We developed PANDORA, a zero-shot coding system composed of a Coder, a Reviewer, and a persistent natural-language Learning File derived from prior coding errors. Using discharge summaries from MIMIC-IV and a Top-50 ICD-10-CM benchmark, we compared a no-memory baseline (Phase 1) against a memory-augmented configuration (Phase 4). Performance was assessed across 20 recursive training iterations and on a held-out testing set of 500 cases, using micro-F1, macro-F1, precision, and recall at both exact-code and ICD-3 levels. Error composition, representative memory-guided decisions, and temporal degradation associated with memory growth were also analyzed.

RESULTS

In the held-out testing set, the memory-augmented system improved exact-code micro-F1 from 0.307 to 0.527 and precision from 0.203 to 0.515, while recall decreased from 0.630 to 0.540. At the ICD-3 level, micro-F1 improved from 0.372 to 0.560. Across training iterations, the memory-augmented condition achieved an exact-code micro-F1 of 0.605 versus 0.318 in the no-memory baseline. Gains were driven primarily by large reductions in false positives, indicating that the Learning File improved precision more than recall. A qualitative review showed that the system used accumulated rules to suppress unsupported codes and to recover context-sensitive diagnoses. However, performance declined after iteration 10 as the Learning File grew larger and less discriminative, suggesting that memory bloat is an important failure mode of recursive learning.

CONCLUSIONS

A recursive learning architecture with explicit natural-language memory substantially improved zero-shot ICD-10-CM coding performance, primarily through better precision and more controlled code assignment. The approach offers transparency benefits because improvements can be traced to human-readable learned rules rather than hidden parameter changes. However, recursive systems require active memory governance, as unchecked rule accumulation may degrade performance over time. These findings support memory-based adaptation as a promising direction for interpretable clinical coding systems and other high-stakes clinical NLP tasks.
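
The micro-averaged metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names (`icd3`, `micro_prf`, `f1_from_pr`), the set-per-case representation, and the toy codes are assumptions made for illustration, and ICD-3 matching is assumed here to mean comparing codes truncated to their three-character category. The final lines only check that the precision and recall values reported for the held-out set are consistent with the reported exact-code micro-F1 under the usual harmonic-mean definition.

```python
from typing import List, Set, Tuple


def icd3(code: str) -> str:
    """Truncate an ICD-10-CM code to its three-character category (assumed ICD-3 level)."""
    return code.replace(".", "").upper()[:3]


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1: pool TP/FP/FN over all cases first."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # codes assigned and correct
        fp += len(p - g)  # codes assigned but not in the gold set
        fn += len(g - p)  # gold codes that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from_pr(precision, recall)


def f1_from_pr(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy data (hypothetical, two cases) to show exact-code versus ICD-3 scoring.
gold = [{"I50.9", "E11.9"}, {"N17.9"}]
pred = [{"I50.9", "I10"}, {"N17.0", "E11.9"}]

exact = micro_prf(gold, pred)
category = micro_prf(
    [{icd3(c) for c in g} for g in gold],
    [{icd3(c) for c in p} for p in pred],
)
print("exact-code P/R/F1:", exact)
print("ICD-3 P/R/F1:", category)

# Consistency check on the figures reported in the abstract (held-out set).
print(round(f1_from_pr(0.203, 0.630), 3))  # 0.307, no-memory baseline
print(round(f1_from_pr(0.515, 0.540), 3))  # 0.527, memory-augmented
```

In the toy data, N17.0 against a gold code of N17.9 counts as an error at the exact-code level but as a match after truncation, which is why scores at the ICD-3 level are typically at least as high as exact-code scores.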

Topics

Medical Coding and Health Information · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education