Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models

📅 2025-05-20
🤖 AI Summary
This paper identifies "neural incompatibility"—fundamental structural and behavioral mismatches between models of differing scales—as the core bottleneck hindering cross-scale parametric knowledge transfer (PKT) among large language models (LLMs). The authors formally distinguish two PKT paradigms, Post-Align and Pre-Align, and propose LaTen, a lightweight method that aligns cross-scale parameter spaces in only a few training steps. Through parameter-space analysis, LoRA initialization studies, and empirical evaluation on four standard benchmarks, they show that both paradigms struggle to achieve consistently stable transfer and identify neural incompatibility as the dominant limiting factor, pointing to new directions for efficient cross-scale knowledge reuse.

📝 Abstract
Large Language Models (LLMs) offer a transparent brain with accessible parameters that encode extensive knowledge, which can be analyzed, located and transferred. Consequently, a key research challenge is to transcend traditional knowledge transfer paradigms rooted in symbolic language and achieve genuine Parametric Knowledge Transfer (PKT). Significantly, exploring effective methods for transferring knowledge across LLMs of different scales through parameters presents an intriguing and valuable research direction. In this paper, we first demonstrate $\textbf{Alignment}$ in parametric space is the fundamental prerequisite to achieve successful cross-scale PKT. We redefine the previously explored knowledge transfer as Post-Align PKT (PostPKT), which utilizes extracted parameters for LoRA initialization and requires subsequent fine-tuning for alignment. Hence, to reduce the cost of further fine-tuning, we introduce a novel Pre-Align PKT (PrePKT) paradigm and propose a solution called $\textbf{LaTen}$ ($\textbf{L}$oc$\textbf{a}$te-$\textbf{T}$h$\textbf{e}$n-Alig$\textbf{n}$) that aligns the parametric spaces of LLMs across scales using only several training steps without subsequent training. Comprehensive experiments on four benchmarks demonstrate that both PostPKT and PrePKT face challenges in achieving consistently stable transfer. Through in-depth analysis, we identify $\textbf{Neural Incompatibility}$ as the ethological and parametric structural differences between LLMs of varying scales, presenting fundamental challenges to achieving effective PKT. These findings provide fresh insights into the parametric architectures of LLMs and highlight promising directions for future research on efficient PKT. Our code is available at https://github.com/Trae1ounG/Neural_Incompatibility.
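The abstract describes PostPKT as initializing LoRA adapters from parameters extracted from another model. The paper's actual extraction and alignment procedure is not detailed here, but the generic idea—factoring an extracted weight delta into low-rank LoRA matrices—can be sketched with a truncated SVD. The function name and shapes below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def lora_init_from_delta(delta: np.ndarray, rank: int):
    """Factor a weight delta into LoRA matrices A, B such that B @ A ~= delta.

    Illustrative sketch only: PostPKT as described uses extracted parameters
    to initialize LoRA; a truncated SVD is one standard way to obtain such a
    low-rank initialization.
    """
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank]   # shape (out_dim, rank), absorbs singular values
    a = vt[:rank, :]             # shape (rank, in_dim)
    return a, b

rng = np.random.default_rng(0)
# Toy "extracted" weight delta with rank <= 4 (product of thin matrices)
delta = rng.standard_normal((8, 4)) @ rng.standard_normal((4, 16))
a, b = lora_init_from_delta(delta, rank=4)
print(np.allclose(b @ a, delta))  # rank-4 SVD reconstructs a rank-4 delta
```

A rank-r truncated SVD is the best rank-r approximation in Frobenius norm, which is why it is a common choice for seeding LoRA factors from a full weight difference.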
Problem

Research questions and friction points this paper is trying to address.

Achieving cross-scale Parametric Knowledge Transfer in LLMs
Aligning parametric spaces of different-scale LLMs efficiently
Addressing Neural Incompatibility in parametric knowledge transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

Alignment in parametric space enables cross-scale PKT
LaTen aligns parametric spaces without extensive training
Neural Incompatibility challenges stable parametric knowledge transfer
Yuqiao Tan
Institute of Automation, Chinese Academy of Sciences
LLMs Reasoning; LLMs Interpretability
Shizhu He
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Kang Liu
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Jun Zhao
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China