Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing fine-tuning reweighting methods that rely on a single metric, such as ground-truth label probability or token entropy: these metrics are susceptible to noise or dominated by pretraining priors, and therefore struggle to identify the tokens that need focused learning. To overcome this, the authors propose a joint calibration mechanism that dynamically assesses each token's learning status by leveraging the relative ranks of the ground-truth probability and the entropy of the predictive distribution. Based on this assessment, an adaptive reweighting strategy prioritizes under-learned yet high-certainty tokens while preserving the model's ability to capture uncertainty. Experiments demonstrate consistent and significant improvements across multiple backbone models in mathematical reasoning, out-of-distribution generalization, and code generation, outperforming baselines that use probability or entropy alone.

📝 Abstract
Token-level reweighting is a simple yet effective mechanism for controlling supervised fine-tuning, but common indicators are largely one-dimensional: the ground-truth probability reflects downstream alignment, while token entropy reflects intrinsic uncertainty induced by the pre-training prior. Ignoring entropy can misidentify noisy or easily replaceable tokens as learning-critical, while ignoring probability fails to reflect target-specific alignment. RankTuner introduces a probability-entropy calibration signal, the Relative Rank Indicator, which compares the rank of the ground-truth token with its expected rank under the prediction distribution. The inverse indicator is used as a token-wise Relative Scale to reweight the fine-tuning objective, focusing updates on truly under-learned tokens without over-penalizing intrinsically uncertain positions. Experiments on multiple backbones show consistent improvements on mathematical reasoning benchmarks, transfer gains on out-of-distribution reasoning, and improved code generation performance over probability-only and entropy-only reweighting baselines.
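The abstract's core signal can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it computes the rank of the ground-truth token under the model's predictive distribution, the expected rank of a sample drawn from that same distribution, and their ratio as a rank-based calibration indicator. The function name, the exact ratio, and the mean-one normalization are all assumptions for illustration; the paper's precise Relative Rank Indicator and Relative Scale may differ.

```python
import numpy as np

def relative_rank_weights(logits, targets):
    """Per-token weights from a rank-based calibration signal (illustrative).

    logits: (seq_len, vocab) float array of model outputs.
    targets: (seq_len,) int array of ground-truth token ids.
    Returns a (seq_len,) array of positive weights with mean 1.
    """
    # Softmax over the vocabulary (numerically stabilized).
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # Rank of the ground-truth token (1 = most probable under the model).
    target_probs = np.take_along_axis(probs, targets[:, None], axis=-1)
    gt_rank = (probs > target_probs).sum(axis=-1) + 1.0
    # Expected rank of a token sampled from the prediction distribution:
    # sum over positions i of p_(i) * i, with probabilities sorted descending.
    sorted_probs = np.sort(probs, axis=-1)[:, ::-1]
    ranks = np.arange(1, probs.shape[-1] + 1)
    expected_rank = (sorted_probs * ranks).sum(axis=-1)
    # Indicator > 1 when the ground-truth token ranks worse than the
    # distribution's own expected rank, i.e. the token looks under-learned
    # relative to the model's intrinsic uncertainty at that position.
    rri = gt_rank / expected_rank
    # Use the indicator as a relative scale, normalized to mean 1.
    return rri / rri.mean()
```

A confidently correct token (ground-truth at rank 1 under a peaked distribution) gets a weight near 1, while a token the model ranks well below its own expected rank gets a larger weight, which is the qualitative behavior the abstract describes for under-learned positions.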
Problem

Research questions and friction points this paper is trying to address.

token-level reweighting
probability-entropy calibration
supervised fine-tuning
intrinsic uncertainty
downstream alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probability-Entropy Calibration
Relative Rank Indicator
Token-level Reweighting
Adaptive Fine-tuning
RankTuner