🤖 AI Summary
Problem: Deep learning models for automatic clinical coding (e.g., ICD coding) suffer from poor interpretability, undermining clinical trustworthiness. Existing interpretability studies rely heavily on attention mechanisms and lack high-quality rationale datasets and dedicated rationale generation methods.
Method: We propose (1) a fine-grained, densely annotated rationale dataset for ICD codes; (2) a dual-perspective evaluation framework assessing both *plausibility* (agreement with human expert judgment) and *faithfulness* (fidelity to the model's actual reasoning); and (3) rationale learning methods that use rationales generated by large language models (LLMs), prompted with or without annotated examples, as distant supervision signals, so that supervised learning needs only a few human annotations.
Results: LLM-generated rationales achieve high agreement with expert judgments and significantly outperform baselines in both faithfulness and plausibility. Moreover, integrating these rationales improves downstream coding accuracy, demonstrating their utility for enhancing model transparency and performance.
📝 Abstract
Automated clinical coding involves mapping unstructured text from Electronic Health Records (EHRs) to standardized code systems such as the International Classification of Diseases (ICD). While recent advances in deep learning have significantly improved the accuracy and efficiency of ICD coding, the lack of explainability in these models remains a major limitation, undermining trust and transparency. Current work on explainability largely relies on attention-based techniques and qualitative assessments by physicians. It lacks both systematic evaluation using consistent criteria on high-quality rationale datasets and dedicated approaches explicitly trained to generate rationales that would further improve explanations. In this work, we conduct a comprehensive evaluation of the explainability of rationales for ICD coding through two key lenses: faithfulness, which evaluates how well explanations reflect the model's actual reasoning, and plausibility, which measures how consistent the explanations are with human expert judgment. To facilitate the evaluation of plausibility, we construct a new rationale-annotated dataset that offers denser annotations at diverse granularity and aligns better with current clinical practice, and we evaluate three types of rationales for ICD coding. Encouraged by the promising plausibility of LLM-generated rationales for ICD coding, we further propose new rationale learning methods to improve the quality of model-generated rationales, in which rationales produced by prompting LLMs with or without annotation examples are used as distant supervision signals. We empirically find that LLM-generated rationales align most closely with those of human experts. Moreover, incorporating few-shot human-annotated examples not only further improves rationale generation but also enhances rationale-learning approaches.
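The abstract does not say which metrics implement the two lenses, so the following is only an illustrative sketch under stated assumptions. It assumes plausibility is scored as token-level F1 between a model's rationale and an expert annotation, and faithfulness is scored with erasure-based comprehensiveness and sufficiency, which are common choices in the rationale-evaluation literature but may differ from the paper's actual measures. The function names, the `predict_proba` callable, and the toy model are all hypothetical.

```python
from typing import Callable, List, Set

# Hypothetical scorer type: maps a token list to the probability of one ICD code.
Scorer = Callable[[List[str]], float]


def token_f1(predicted: Set[int], gold: Set[int]) -> float:
    """Plausibility proxy (assumed): F1 overlap of rationale token indices."""
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def comprehensiveness(predict_proba: Scorer, tokens: List[str], rationale: Set[int]) -> float:
    """Faithfulness proxy (assumed): probability drop when the rationale is erased."""
    without = [t for i, t in enumerate(tokens) if i not in rationale]
    return predict_proba(tokens) - predict_proba(without)


def sufficiency(predict_proba: Scorer, tokens: List[str], rationale: Set[int]) -> float:
    """Faithfulness proxy (assumed): probability drop when only the rationale is kept."""
    only = [t for i, t in enumerate(tokens) if i in rationale]
    return predict_proba(tokens) - predict_proba(only)


def toy_predict_proba(tokens: List[str]) -> float:
    """Stand-in for a trained ICD coder: probability grows with keyword hits."""
    keywords = {"pneumonia", "infiltrate"}
    return 0.1 + 0.4 * len(keywords & set(tokens))


note = "chest xray shows infiltrate consistent with pneumonia".split()
model_rationale = {3, 6}    # indices of "infiltrate" and "pneumonia"
expert_rationale = {1, 3, 6}  # the expert also marked "xray"

plausibility = token_f1(model_rationale, expert_rationale)
comp = comprehensiveness(toy_predict_proba, note, model_rationale)
suff = sufficiency(toy_predict_proba, note, model_rationale)
```

Under these assumed definitions, a faithful rationale has high comprehensiveness (erasing it hurts the prediction) and sufficiency near zero (the rationale alone preserves the prediction), while plausibility depends only on overlap with the expert annotation and not on the model's behavior.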