🤖 AI Summary
This work addresses key challenges in applying large language models (LLMs) to ICD coding—namely, insufficient training data coverage, poor interpretability, and high computational costs associated with long clinical documents. To overcome these limitations, the authors propose a code-centric learning framework that shifts supervision from full clinical notes to short, evidence-based text snippets, enabling snippet-level training to enhance document-level coding performance. By incorporating code-centric data augmentation and a hybrid fine-tuning strategy, the method significantly reduces training overhead while improving generalization to unseen ICD codes and preserving decision interpretability. Experimental results demonstrate that, using the same LLM backbone, the proposed approach substantially outperforms strong baselines, enabling small open-source models to achieve coding accuracy comparable to that of large proprietary models.
📝 Abstract
ICD coding is a critical yet challenging task in healthcare. Recently, LLM-based methods have demonstrated stronger generalization than discriminative methods in ICD coding. However, fine-tuning LLMs for ICD coding faces three major challenges. First, existing public ICD coding datasets provide limited coverage of the ICD code space, restricting a model's ability to generalize to unseen codes. Second, naive fine-tuning diminishes the interpretability of LLMs, as few public datasets contain explicit supporting evidence for assigned codes. Third, ICD coding typically involves long clinical documents, making fine-tuning LLMs computationally expensive. To address these issues, we propose Code-Centric Learning, a training framework that shifts supervision from full clinical documents to scalable, short evidence spans. The key idea of this framework is that span-level learning improves LLMs' ability to perform document-level ICD coding. Our proposed framework consists of a mixed training strategy and code-centric data expansion, which substantially reduces training cost, improves accuracy on unseen ICD codes, and preserves interpretability. Under the same LLM backbone, our method substantially outperforms strong baselines. Notably, our method enables small-scale LLMs to achieve performance comparable to that of much larger proprietary models, demonstrating its effectiveness and potential for fully automated ICD coding.
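To make the key idea concrete, the sketch below illustrates what span-level supervision might look like: instead of pairing an entire clinical note with its ICD codes, each code is paired with a short evidence snippet that supports it. This is a hypothetical illustration, not the authors' implementation; all names, prompt formats, and example snippets here are assumptions.

```python
# Hypothetical sketch of span-level supervision for ICD coding.
# Instead of (long clinical document -> set of ICD codes), training pairs
# become (short evidence span -> single ICD code), which shortens inputs
# and ties each code to explicit supporting evidence.

def build_span_examples(evidence_map):
    """evidence_map: dict mapping an ICD code to short evidence snippets."""
    examples = []
    for code, spans in evidence_map.items():
        for span in spans:
            examples.append({
                "input": f"Evidence: {span}\nWhich ICD code does this support?",
                "target": code,
            })
    return examples

# Toy evidence map with invented snippets (for illustration only).
evidence = {
    "I10": ["History of essential hypertension, on lisinopril."],
    "E11.9": ["Type 2 diabetes mellitus, well controlled on metformin."],
}

examples = build_span_examples(evidence)
print(len(examples))  # two short training pairs instead of one long document
```

Each resulting pair is far shorter than a full discharge summary, which is consistent with the abstract's claims of reduced training cost and preserved interpretability, since every predicted code is grounded in a specific span.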