Meta-Sparsity: Learning Optimal Sparse Structures in Multi-task Networks through Meta-learning

📅 2025-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manual design of model architectures in multi-task learning (MTL) makes it difficult to optimize efficiency and generalization simultaneously. Method: This paper proposes a meta-learning-driven, end-to-end structured-sparsity method that combines meta-learning with channel-level structured pruning, enabling learnable, task-shared sparsity parameters and dynamically generated shared structures without manual hyperparameter tuning. Built on the MAML framework, it jointly optimizes shared weights and sparse topologies during meta-training and supports cross-task transfer of sparsity patterns, improving adaptability to unseen tasks. Contribution/Results: Evaluated on NYU-v2 and CelebAMask-HQ, the method outperforms state-of-the-art sparse MTL models: it reduces parameter count by 30–50%, accelerates inference by 1.8×–2.3×, and maintains or even improves multi-task accuracy.
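The meta-training loop described above, jointly optimizing shared weights and a sparsity-inducing penalty across tasks, can be sketched in miniature. The toy linear model, the task construction, the learning rates, and the soft-thresholding step below are illustrative assumptions for a first-order MAML variant, not the paper's actual implementation (which applies channel-wise structured sparsity to deep networks):

```python
def predict(w, x):
    """Toy 2-parameter linear model shared across tasks."""
    return w[0] * x[0] + w[1] * x[1]

def grad_mse(w, batch):
    """Gradient of mean squared error over a batch of (x, y) pairs."""
    g = [0.0, 0.0]
    for x, y in batch:
        err = predict(w, x) - y
        g[0] += 2 * err * x[0] / len(batch)
        g[1] += 2 * err * x[1] / len(batch)
    return g

def meta_train(tasks, steps=200, inner_lr=0.05, outer_lr=0.05, lam=0.01):
    """First-order MAML-style meta-training with a sparsity penalty.

    Each meta-iteration: adapt the shared weights to every task with one
    inner gradient step, accumulate the post-adaptation (query) gradients
    as the meta-gradient, then apply an L1 proximal shrinkage so weights
    no task keeps alive are driven exactly to zero.
    """
    w = [0.5, 0.5]  # shared initialization
    for _ in range(steps):
        meta_g = [0.0, 0.0]
        for support, query in tasks:
            g = grad_mse(w, support)
            fast = [w[0] - inner_lr * g[0], w[1] - inner_lr * g[1]]
            qg = grad_mse(fast, query)  # first-order meta-gradient
            meta_g[0] += qg[0] / len(tasks)
            meta_g[1] += qg[1] / len(tasks)
        w = [w[0] - outer_lr * meta_g[0], w[1] - outer_lr * meta_g[1]]
        # soft-thresholding: the sparsity penalty's proximal step
        w = [max(abs(v) - outer_lr * lam, 0.0) * (1.0 if v >= 0 else -1.0)
             for v in w]
    return w
```

In this toy setup, tasks that only depend on the first input leave the second weight unsupported by any task gradient, so the proximal step prunes it to exactly zero while the shared useful weight survives, which is the shared-sparse-structure effect in miniature.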

📝 Abstract
This paper presents meta-sparsity, a framework for learning model sparsity, i.e., learning the parameter that controls the degree of sparsity, so that deep neural networks (DNNs) can inherently generate optimal sparse shared structures in a multi-task learning (MTL) setting. Unlike traditional sparsity methods that rely heavily on manual hyperparameter tuning, the proposed approach enables the dynamic learning of sparsity patterns across a variety of tasks. Inspired by Model Agnostic Meta-Learning (MAML), the emphasis is on learning shared and optimally sparse parameters in multi-task scenarios by applying a penalty-based, channel-wise structured sparsity during the meta-training phase. This method improves the model's efficiency by removing unnecessary parameters and enhances its ability to handle both seen and previously unseen tasks. The effectiveness of meta-sparsity is rigorously evaluated through extensive experiments on two datasets, NYU-v2 and CelebAMask-HQ, covering a broad spectrum of tasks ranging from pixel-level to image-level predictions. The results show that the proposed approach performs well across many tasks, indicating its potential as a versatile tool for creating efficient and adaptable sparse neural networks. This work therefore presents an approach to learning sparsity, contributing to the field of sparse neural networks and suggesting new directions for research on parsimonious models.
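The penalty-based, channel-wise structured sparsity mentioned in the abstract is typically realized as a group-lasso term: the L2 norm of each output channel's weights, summed over channels, which drives entire channels (not individual weights) to zero. A minimal sketch follows; the function names, the nested-list weight layout, and the pruning threshold are illustrative assumptions, not details taken from the paper:

```python
import math

def channel_group_lasso(weight):
    """Group-lasso penalty: sum over output channels of the L2 norm of
    that channel's weights. `weight` is a nested list shaped
    [out_channels][in_channels][k][k], i.e. a conv filter bank."""
    penalty = 0.0
    for channel in weight:
        sq = sum(w * w for plane in channel for row in plane for w in row)
        penalty += math.sqrt(sq)
    return penalty

def prune_channels(weight, threshold):
    """Zero out whole output channels whose L2 norm falls below the
    threshold -- structured (channel-wise) sparsity, so the pruned
    channels can be removed from the architecture entirely."""
    pruned = []
    for channel in weight:
        norm = math.sqrt(sum(w * w for plane in channel
                             for row in plane for w in row))
        if norm < threshold:
            channel = [[[0.0 for _ in row] for row in plane]
                       for plane in channel]
        pruned.append(channel)
    return pruned
```

Because the penalty groups weights by channel, minimizing it shrinks weak channels as a unit, which is what makes the resulting sparsity hardware-friendly compared with unstructured weight-level pruning.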
Problem

Research questions and friction points this paper is trying to address.

Multi-task Learning
Neural Architecture Search
Resource Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Meta-Sparsification
Multi-Task Learning
Adaptive Structure Optimization