CTPD: Cross Tokenizer Preference Distillation

📅 2026-01-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses fine-grained preference alignment in knowledge distillation when teacher and student language models use heterogeneous tokenizers. The authors propose CTPD, a unified framework that enables cross-tokenizer semantic mapping through character-level alignment. The approach introduces three key innovations: Aligned Span Projection, a cross-tokenizer adaptation of Token-level Importance Sampling for Direct Preference Optimization (TIS-DPO), and a teacher-anchored reference mechanism. CTPD is presented as the first method to transfer human preferences efficiently and generalizably across disparate tokenization schemes. Extensive experiments on multiple benchmarks show significant gains over existing approaches, confirming its effectiveness and scalability for aligning preferences between heterogeneous language models.

📝 Abstract
While knowledge distillation has seen widespread use in pre-training and instruction tuning, its application to aligning language models with human preferences remains underexplored, particularly in the more realistic cross-tokenizer setting. The incompatibility of tokenization schemes between teacher and student models has largely prevented fine-grained, white-box distillation of preference information. To address this gap, we propose Cross-Tokenizer Preference Distillation (CTPD), the first unified framework for transferring human-aligned behavior between models with heterogeneous tokenizers. CTPD introduces three key innovations: (1) Aligned Span Projection, which maps teacher and student tokens to shared character-level spans for precise supervision transfer; (2) a cross-tokenizer adaptation of Token-level Importance Sampling (TIS-DPO) for improved credit assignment; and (3) a Teacher-Anchored Reference, allowing the student to directly leverage the teacher's preferences in a DPO-style objective. Our theoretical analysis grounds CTPD in importance sampling, and experiments across multiple benchmarks confirm its effectiveness, with significant performance gains over existing methods. These results establish CTPD as a practical and general solution for preference distillation across diverse tokenization schemes, opening the door to more accessible and efficient alignment of language models.
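The Teacher-Anchored Reference described in the abstract plugs naturally into the standard DPO objective. A plausible form, assuming the usual DPO loss with the teacher policy $\pi_T$ (evaluated over aligned spans) substituted for the frozen reference model (the paper's exact weighting scheme may differ):

```latex
\mathcal{L}_{\text{CTPD}}(\theta)
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)}
\log \sigma\!\Big(
\beta \Big[
\log \tfrac{\pi_\theta(y_w \mid x)}{\pi_T(y_w \mid x)}
- \log \tfrac{\pi_\theta(y_l \mid x)}{\pi_T(y_l \mid x)}
\Big]\Big)
```

Here $y_w$ and $y_l$ are the preferred and dispreferred responses, $\sigma$ is the logistic function, and $\beta$ controls the strength of the KL-style anchor; anchoring to $\pi_T$ lets the student inherit the teacher's preference margins directly, with TIS-DPO-style importance weights refining per-token credit assignment.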
Problem

Research questions and friction points this paper is trying to address.

preference distillation
cross-tokenizer
language model alignment
tokenization incompatibility
human preference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-Tokenizer Preference Distillation
Aligned Span Projection
Token-level Importance Sampling
Teacher-Anchored Reference
Preference Alignment