🤖 AI Summary
This work addresses three limitations of large language models (LLMs) in complex text classification: weak reasoning, poor interpretability, and a trade-off between performance and scalability. The proposed eXTC framework combines structured prompt optimization, knowledge distillation, and reinforcement learning in three stages: it learns a natural-language standard operating procedure (SOP), distills a large teacher LLM's SOP-grounded reasoning into a compact model, and then extends that reasoning beyond the initial SOP. The framework offers two levels of interpretability: a global, modular explanation of the learned rules and local reasoning traces at inference time. Reported experiments show that eXTC significantly outperforms existing paradigms across diverse benchmarks in both classification performance and explanation quality, while keeping inference fast.
📝 Abstract
LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable instructions but struggles with performance and scalability. We introduce eXTC (eXplainable Text Classifier) with three progressive stages: (1) learning a Standard Operating Procedure (SOP, or rulebook) in natural language via a new Structured Prompt Optimization algorithm; (2) SOP-grounded reasoning distillation from a large teacher LLM into a compact LM; and (3) expanding reasoning capabilities beyond the initial SOP via reinforcement learning. This design enables eXTC to provide (i) fast inference via a compact LM, with (ii) inference-time local reasoning traces, alongside a global, modular explanation of its learned domain rules, while (iii) significantly outperforming existing paradigms across diverse benchmarks in both classification performance and explanation quality, with stage-by-stage gains.
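The two kinds of explanation named in the abstract, a global modular rulebook and a local per-input reasoning trace, can be illustrated with a toy sketch. This is not the eXTC implementation: the `Rule` class, the example SOP, the keyword matching, and the labels are all hypothetical stand-ins. In the paper the SOP is learned by Structured Prompt Optimization and applied by a language model, neither of which is reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One module of a hypothetical SOP: a readable instruction plus a toy keyword check."""
    name: str
    instruction: str
    keywords: tuple
    label: str


# Hypothetical hand-written SOP, used only for illustration.
# In the paper the SOP is learned, not written by hand.
SOP = (
    Rule("refund", "If the text asks for money back, label it 'billing'.",
         ("refund", "charge"), "billing"),
    Rule("crash", "If the text reports the app failing, label it 'bug'.",
         ("crash", "error"), "bug"),
)


def classify(text, sop=SOP, default="other"):
    """Return (label, trace); the trace is the local explanation for this input."""
    lowered = text.lower()
    trace = []
    for rule in sop:
        hits = [k for k in rule.keywords if k in lowered]
        if hits:
            # The first rule that fires decides the label.
            trace.append(f"{rule.name}: matched {hits} -> {rule.label}")
            return rule.label, trace
        trace.append(f"{rule.name}: no match")
    trace.append(f"no rule fired -> {default}")
    return default, trace


def global_explanation(sop=SOP):
    """Return the global explanation: every rule module in readable form."""
    return [f"[{r.name}] {r.instruction}" for r in sop]


label, trace = classify("The app keeps showing an error after login")
print(label)
print(trace)
print(global_explanation())
```

The point of the sketch is only the separation of concerns: `global_explanation` exposes the rulebook as a whole, while the trace returned by `classify` records which modules were consulted for one specific input.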