🤖 AI Summary
Existing LLM-based table classification methods suffer from high computational overhead, suboptimal in-context example selection, and limited interpretability. To address these limitations, we propose the first explanation-driven, three-stage *post-hoc* in-context learning framework: (1) an LLM generates structured reasoning paths (explanations) for candidate demonstrations; (2) semantically relevant in-context examples are retrieved via embedding-based similarity matching over those explanations; and (3) a lightweight surrogate language model (SLM), guided by the explanations, performs interpretable classification via fine-tuning or prompt optimization. This paradigm integrates post-hoc explanation generation, semantics-aware retrieval, and SLM adaptation, eliminating the need for end-to-end LLM inference. Evaluated across diverse domain-specific tabular datasets, our approach achieves a 5.31% average accuracy improvement over strong baselines while significantly enhancing inference efficiency and decision transparency. This work establishes a novel, resource-efficient paradigm for trustworthy table classification in constrained deployment scenarios.
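The retrieval step in stage (2) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `embed` function below is a toy bag-of-words placeholder for whichever sentence-embedding model is actually used, and the candidate demonstrations are made up for the example.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" standing in for a real sentence-embedding
    # model (hypothetical placeholder; the summary does not name one).
    return dict(Counter(text.lower().split()))

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_demonstrations(query_explanation, candidates, k=2):
    """Rank candidate demonstrations by similarity between the query's
    explanation and each candidate's LLM-generated explanation."""
    q = embed(query_explanation)
    ranked = sorted(
        candidates,
        key=lambda c: cosine(q, embed(c["explanation"])),
        reverse=True,
    )
    return ranked[:k]

# Illustrative candidate pool (fabricated for this sketch).
candidates = [
    {"qa": ("row A", "yes"), "explanation": "income exceeds the approval threshold"},
    {"qa": ("row B", "no"), "explanation": "missing credit history lowers confidence"},
    {"qa": ("row C", "yes"), "explanation": "income and tenure exceed the threshold"},
]
top = select_demonstrations("income is above the threshold", candidates, k=2)
```

In practice the bag-of-words stand-in would be replaced by a dense embedding model, but the selection logic (embed, score, take top-k) is the same.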
📝 Abstract
Large Language Models (LLMs) have shown remarkable ability in solving complex tasks, making them a promising tool for enhancing tabular learning. However, existing LLM-based methods suffer from high resource requirements, suboptimal demonstration selection, and limited interpretability, which largely hinder their prediction performance and real-world applicability. To overcome these problems, we propose a novel in-context learning framework for tabular prediction. The core idea is to leverage explanations generated by LLMs to guide a smaller, locally deployable Surrogate Language Model (SLM) toward interpretable tabular predictions. Specifically, our framework involves three stages: (i) Post-Hoc Explanation Generation, where LLMs generate explanations for the question-answer pairs in candidate demonstrations, revealing the reasoning behind each answer; (ii) Post-Hoc Explanation-Guided Demonstration Selection, which uses these explanations to guide the selection of demonstrations from the candidate pool; and (iii) Post-Hoc Explanation-Guided Interpretable SLM Prediction, which uses the demonstrations selected in stage (ii) as in-context examples and merges their corresponding explanations as rationales, improving SLM performance and guiding the model to produce interpretable outputs. Experimental results highlight the framework's effectiveness, with an average accuracy improvement of 5.31% across tabular datasets in diverse domains.
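Stage (iii) assembles the selected demonstrations and their explanations into an in-context prompt for the SLM. The template below is purely illustrative, assuming a simple input/rationale/answer layout; the abstract does not specify the exact prompt format, and the field names are hypothetical.

```python
def build_prompt(query, demonstrations):
    """Assemble an in-context prompt that merges each selected demonstration
    with its LLM-generated explanation as a rationale, so the SLM is guided
    to emit a rationale before its answer. Format is an assumption."""
    parts = []
    for d in demonstrations:
        parts.append(
            f"Input: {d['question']}\n"
            f"Rationale: {d['explanation']}\n"
            f"Answer: {d['answer']}\n"
        )
    # The query ends at "Rationale:" so the SLM continues with its own
    # reasoning and then an answer, yielding an interpretable output.
    parts.append(f"Input: {query}\nRationale:")
    return "\n".join(parts)

# Fabricated demonstrations for illustration only.
demos = [
    {"question": "age=45, income=90k", "explanation": "high income favors approval", "answer": "yes"},
    {"question": "age=22, income=15k", "explanation": "low income disfavors approval", "answer": "no"},
]
prompt = build_prompt("age=30, income=70k", demos)
```

The same prompt structure can serve either use named in the abstract: as fine-tuning targets (rationale plus answer as supervision) or as a fixed few-shot context at inference time.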