The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment

📅 2026-04-07
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of transferring reasoning capabilities from large language models to smaller ones without retraining or reliance on labeled data. It introduces the "Master Key Hypothesis," positing that model abilities are encoded along specific directions within a low-dimensional latent subspace. Building on this insight, the authors propose the UNLOCK framework, which extracts these capability directions via activation contrast, aligns subspaces across models of different scales using low-rank linear transformations, and injects the identified directions during inference to unlock latent reasoning abilities in the target model. Experiments demonstrate substantial performance gains: for instance, Qwen1.5-7B achieves a 12.1% accuracy improvement on MATH, while Qwen3-14B-Base improves from 61.1% to 71.3% on AGIEval Math, surpassing its post-trained counterpart (67.8%).
πŸ“ Abstract
We investigate whether post-trained capabilities can be transferred across models without retraining, with a focus on transfer across different model scales. We propose the Master Key Hypothesis, which states that model capabilities correspond to directions in a low-dimensional latent subspace that induce specific behaviors and are transferable across models through linear alignment. Based on this hypothesis, we introduce UNLOCK, a training-free and label-free framework that extracts a capability direction by contrasting activations between capability-present and capability-absent Source variants, aligns it with a Target model through a low-rank linear transformation, and applies it at inference time to elicit the behavior. Experiments on reasoning behaviors, including Chain-of-Thought (CoT) and mathematical reasoning, demonstrate substantial improvements across model scales without training. For example, transferring CoT reasoning from Qwen1.5-14B to Qwen1.5-7B yields an accuracy gain of 12.1% on MATH, and transferring a mathematical reasoning direction from Qwen3-4B-Base to Qwen3-14B-Base improves AGIEval Math accuracy from 61.1% to 71.3%, surpassing the 67.8% achieved by the 14B post-trained model. Our analysis shows that the success of transfer depends on the capabilities learned during pre-training, and that our intervention amplifies latent capabilities by sharpening the output distribution toward successful reasoning trajectories.
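The abstract's three-step pipeline (contrast activations to extract a direction, fit a low-rank linear map between Source and Target activation spaces, inject the mapped direction at inference) can be sketched in toy NumPy form. All dimensions, the least-squares fit, the SVD truncation, and the steering coefficient `alpha` below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_src, d_tgt, n, r = 64, 96, 200, 8  # toy dims: source/target hidden sizes, samples, rank

# 1) Extract a capability direction by activation contrast: the mean activation
#    difference between a capability-present and a capability-absent variant of
#    the Source model (random toy stand-ins here).
acts_present = rng.normal(size=(n, d_src)) + 1.0
acts_absent = rng.normal(size=(n, d_src))
direction_src = acts_present.mean(axis=0) - acts_absent.mean(axis=0)

# 2) Align Source and Target subspaces with a low-rank linear transformation.
#    One plausible reading: fit W = argmin ||H_src W - H_tgt||_F on paired
#    activations from shared inputs, then truncate W to rank r via SVD.
H_src = rng.normal(size=(n, d_src))
H_tgt = rng.normal(size=(n, d_tgt))
W, *_ = np.linalg.lstsq(H_src, H_tgt, rcond=None)        # (d_src, d_tgt)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W_lowrank = (U[:, :r] * S[:r]) @ Vt[:r]                  # rank-r approximation

# 3) Inject the aligned direction into a Target hidden state at inference time.
direction_tgt = direction_src @ W_lowrank                # map into Target space
alpha = 4.0                                              # steering strength (assumed)
h_tgt = rng.normal(size=d_tgt)                           # a Target hidden state
h_steered = h_tgt + alpha * direction_tgt / (np.linalg.norm(direction_tgt) + 1e-8)
```

In a real setting, `H_src`/`H_tgt` would be hidden states collected from the two models on shared prompts, and the steered state would replace the Target model's activation at a chosen layer; the rank `r` and `alpha` would be tuned per capability.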
Problem

Research questions and friction points this paper is trying to address.

capability transfer
cross-model
linear subspace alignment
post-trained capabilities
model scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Master Key Hypothesis
capability transfer
linear subspace alignment
training-free adaptation
low-rank transformation