🤖 AI Summary
This paper addresses incremental sequence classification: updating a classifier's prediction as the elements of a sequence arrive one at a time. Drawing on temporal-difference learning from reinforcement learning, the authors identify a temporal-consistency condition that successive predictions should satisfy, and turn it into a novel loss function that regularizes how predictions evolve between adjacent time steps, enabling end-to-end training of incremental classifiers. A worked example shows that optimizing this loss can yield substantial gains in data efficiency, and experiments on several text classification benchmarks show improved predictive accuracy over competing incremental approaches. On the task of verifying large language model generations for correctness on grade-school math problems, models trained with this loss are better able to distinguish promising generations from unpromising ones after observing only a few tokens, improving early decision-making.
📝 Abstract
We address the problem of incremental sequence classification, where predictions are updated as new elements in the sequence are revealed. Drawing on temporal-difference learning from reinforcement learning, we identify a temporal-consistency condition that successive predictions should satisfy. We leverage this condition to develop a novel loss function for training incremental sequence classifiers. Through a concrete example, we demonstrate that optimizing this loss can offer substantial gains in data efficiency. We apply our method to text classification tasks and show that it improves predictive accuracy over competing approaches on several benchmark datasets. We further evaluate our approach on the task of verifying large language model generations for correctness in grade-school math problems. Our results show that models trained with our method are better able to distinguish promising generations from unpromising ones after observing only a few tokens.
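The abstract does not give the loss in closed form, but the idea of a TD-style temporal-consistency term can be sketched as follows. This is a minimal illustration, not the paper's actual method: it assumes per-step class probabilities and combines a consistency penalty between adjacent predictions (each step regressed toward the next step's prediction, treated as a fixed target, as in temporal-difference learning) with a supervised cross-entropy term on the final prediction. The function name and the weight `alpha` are hypothetical.

```python
import numpy as np

def temporal_consistency_loss(probs, label, alpha=1.0):
    """Sketch of a TD-inspired consistency loss (illustrative, not the paper's exact form).

    probs: (T, C) array of per-step class probabilities from an incremental classifier.
    label: true class index, supervising only the final step.
    alpha: hypothetical weight on the consistency term.
    """
    # Consistency term: each step's prediction should match the next step's,
    # with the next-step prediction treated as a fixed (stop-gradient) target.
    targets = probs[1:]
    consistency = np.mean((probs[:-1] - targets) ** 2)
    # Supervised term: cross-entropy on the final prediction only.
    final_ce = -np.log(probs[-1, label] + 1e-12)
    return final_ce + alpha * consistency
```

A sequence whose predictions are confident in the correct class and stable across steps incurs near-zero loss, while predictions that jump around between steps pay a consistency penalty even before the final label is applied.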