Reactivation: Empirical NTK Dynamics Under Task Shifts

📅 2025-07-21
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Neural Tangent Kernel (NTK) theory has largely been confined to static, single-task settings, leaving its behavior under non-stationary, multi-task continual learning poorly understood. Method: This work systematically investigates the temporal evolution of the NTK during sequential task learning in deep neural networks, tracking training trajectories and quantitatively measuring the dynamics of the NTK matrix across task transitions. Contribution/Results: Task switches are shown empirically to induce significant, persistent shifts in the NTK (termed "kernel drift") that appear consistently across large-scale models and datasets. Consequently, the conventional static-NTK approximation fails in continual learning, revealing strong task dependence in feature learning. Moreover, NTK dynamics themselves serve as a novel, interpretable measure of catastrophic forgetting and knowledge transfer. The study provides a first empirical foundation and a new analytical framework for extending NTK theory to non-stationary learning scenarios.
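To make the measurement concrete, here is a minimal sketch of how one might recompute the empirical NTK on a fixed probe set at successive checkpoints and score the resulting drift across a task switch. All names (init_mlp, empirical_ntk, kernel_alignment, probe_x) and the choice of centered kernel alignment as the drift score are illustrative assumptions, not the paper's actual code or metric.

```python
import jax
import jax.numpy as jnp


def init_mlp(key, sizes):
    """Initialize a small scalar-output MLP as a list of (W, b) pairs."""
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (din, dout)) / jnp.sqrt(din),
                       jnp.zeros(dout)))
    return params


def mlp(params, x):
    """Forward pass for a single input x; returns a scalar output."""
    h = x
    for W, b in params[:-1]:
        h = jax.nn.relu(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze()


def empirical_ntk(params, xs):
    """K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)> on a probe set xs."""
    grads = jax.vmap(lambda x: jax.grad(mlp)(params, x))(xs)  # per-example gradients
    flat = jnp.concatenate([g.reshape(xs.shape[0], -1)
                            for g in jax.tree_util.tree_leaves(grads)], axis=1)
    return flat @ flat.T


def kernel_alignment(K1, K2):
    """Centered kernel alignment in [0, 1]; 1 means identical kernel geometry."""
    def center(K):
        n = K.shape[0]
        H = jnp.eye(n) - jnp.ones((n, n)) / n
        return H @ K @ H
    K1c, K2c = center(K1), center(K2)
    return jnp.sum(K1c * K2c) / (jnp.linalg.norm(K1c) * jnp.linalg.norm(K2c))


# Usage: evaluate the kernel on the same probe inputs before and after a task switch.
key = jax.random.PRNGKey(0)
params = init_mlp(key, [8, 64, 64, 1])
probe_x = jax.random.normal(jax.random.PRNGKey(1), (32, 8))

K_before = empirical_ntk(params, probe_x)
# ... train on task A, switch to task B, continue training (updates `params`) ...
K_after = empirical_ntk(params, probe_x)

drift = 1.0 - kernel_alignment(K_before, K_after)  # larger value = more kernel drift
```

The probe set, model scale, and drift statistic are all experimental choices; the point of the sketch is only that the kernel matrix is recomputed on identical inputs at successive checkpoints, so any change reflects movement of the features rather than of the data.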

📝 Abstract
The Neural Tangent Kernel (NTK) offers a powerful tool to study the functional dynamics of neural networks. In the so-called lazy, or kernel, regime, the NTK remains static during training and the network function is linear in the static neural tangent feature space. The evolution of the NTK during training is necessary for feature learning, a key driver of deep learning success. The study of NTK dynamics has led to several critical discoveries in recent years, concerning generalization and scaling behaviours. However, this body of work has been limited to the single-task setting, where the data distribution is assumed constant over time. In this work, we present a comprehensive empirical analysis of NTK dynamics in continual learning, where the data distribution shifts over time. Our findings highlight continual learning as a rich and underutilized testbed for probing the dynamics of neural training. At the same time, they challenge the validity of static-kernel approximations in theoretical treatments of continual learning, even at large scale.
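For readers less familiar with the terminology, the quantities the abstract refers to can be written as follows (standard notation, assumed here rather than quoted from the paper):

```latex
% f(x;\theta) is the network output and \theta_t the parameters at training time t.
% Empirical NTK evaluated on a pair of inputs:
\Theta_t(x, x') = \nabla_\theta f(x;\theta_t)^{\top}\, \nabla_\theta f(x';\theta_t)

% Lazy (kernel) regime: \Theta_t \approx \Theta_0 throughout training, and the network
% is well approximated by its linearization around initialization,
f(x;\theta_t) \approx f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^{\top}(\theta_t - \theta_0),
% i.e. the function is linear in the fixed feature map \phi(x) = \nabla_\theta f(x;\theta_0).
% Feature learning corresponds to \Theta_t moving away from \Theta_0 during training.
```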
Problem

Research questions and friction points this paper is trying to address.

Study NTK dynamics in continual learning settings
Challenge static-kernel approximations in continual learning
Explore feature learning under shifting data distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical NTK dynamics in continual learning
Challenges the validity of static-kernel approximations
Probes shifts in neural training dynamics
Yuzhi Liu
Department of Mathematics, ETH Zurich, Switzerland
Zixuan Chen
Department of Computer Science, ETH Zurich, Switzerland
Zirui Zhang
Department of Computer Science, ETH Zurich, Switzerland
Yufei Liu
Department of Computer Science, ETH Zurich, Switzerland
Giulia Lanzillotta
PhD fellow at the ETH AI Center
continual learning, bio-inspired learning, general artificial intelligence