🤖 AI Summary
This work addresses the dependency of task-specific neural network fine-tuning on labeled task data. We propose a diffusion-based parameter generation paradigm that synthesizes network weights directly from task identifiers—such as textual descriptions or task IDs—thereby eliminating the need for task-specific annotations. Our core contribution is the first application of diffusion models to explicitly model the neural network parameter space, establishing a task-conditional generative framework that learns the mapping from task identifiers to distributions over parameters. The method enables high-fidelity reconstruction of parameters for seen tasks and supports smooth interpolation between multiple tasks in parameter space. Experiments demonstrate accurate parameter generation and stable interpolation for seen tasks; however, generalization to unseen tasks remains limited, exposing a fundamental challenge concerning the transferability and structure of neural parameter spaces and motivating further study of the geometric and statistical properties that govern parameter-space generalization.
📝 Abstract
Adapting neural networks to new tasks typically requires task-specific fine-tuning, which is time-consuming and reliant on labeled data. We explore a generative alternative that produces task-specific parameters directly from task identity, eliminating the need for task-specific training. To this end, we propose using diffusion models to learn the underlying structure of the effective task-specific parameter space and to synthesize parameters on demand. Once trained, the task-conditioned diffusion model can generate specialized weights directly from task identifiers. We evaluate this approach across three scenarios: generating parameters for a single seen task, for multiple seen tasks, and for entirely unseen tasks. Experiments show that diffusion models can generate accurate task-specific parameters and support multi-task interpolation when parameter subspaces are well-structured, but they fail to generalize to unseen tasks, highlighting both the potential and the limitations of this generative approach.
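To make the setup concrete, here is a minimal NumPy sketch of task-conditioned diffusion over a flattened parameter vector. All names (`TaskConditionedDiffusion`, the task-embedding table, the toy linear denoiser) are illustrative assumptions, not the paper's architecture; the forward-noising and ancestral-sampling steps follow the standard DDPM formulation, with the task identifier entering only as a conditioning input to the denoiser.

```python
import numpy as np

class TaskConditionedDiffusion:
    """Hypothetical sketch: diffusion over flattened network weights,
    conditioned on a discrete task id. Not the paper's actual model."""

    def __init__(self, param_dim, num_tasks, embed_dim=16, steps=50, seed=0):
        rng = np.random.default_rng(seed)
        self.steps = steps
        # Linear beta schedule, as in standard DDPM.
        self.betas = np.linspace(1e-4, 0.02, steps)
        self.alphas_cum = np.cumprod(1.0 - self.betas)
        # Stand-ins for learned components: a task-embedding table and a
        # single linear map acting as the noise-prediction network.
        self.task_embed = rng.normal(size=(num_tasks, embed_dim)) * 0.1
        self.W = rng.normal(size=(param_dim + embed_dim + 1, param_dim)) * 0.01

    def _eps_hat(self, x_t, t, task_id):
        # Toy denoiser: predicts noise from (noisy params, task embedding, timestep).
        cond = self.task_embed[task_id]
        inp = np.concatenate([x_t, cond, [t / self.steps]])
        return inp @ self.W

    def q_sample(self, x0, t, noise):
        # Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps.
        a_bar = self.alphas_cum[t]
        return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise

    def sample(self, task_id, param_dim, seed=0):
        # Ancestral sampling: start from pure noise and denoise step by step,
        # conditioned throughout on the task identifier.
        rng = np.random.default_rng(seed)
        x = rng.normal(size=param_dim)
        for t in reversed(range(self.steps)):
            beta, a_bar = self.betas[t], self.alphas_cum[t]
            eps = self._eps_hat(x, t, task_id)
            x = (x - beta / np.sqrt(1.0 - a_bar) * eps) / np.sqrt(1.0 - beta)
            if t > 0:
                x += np.sqrt(beta) * rng.normal(size=param_dim)
        return x  # a generated parameter vector for this task
```

Multi-task interpolation, as described above, would correspond to conditioning on a convex combination of task embeddings rather than a single row of the table.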