SAFE: Finding Sparse and Flat Minima to Improve Pruning

📅 2025-06-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Neural network pruning often struggles to balance sparsity and performance. This paper formulates pruning as a sparse-constrained optimization problem and, for the first time, explicitly incorporates flat minima—known to enhance generalization—as a primary optimization objective, establishing a novel paradigm that jointly optimizes sparsity and flatness. Leveraging the augmented Lagrangian dual method, we design a generalized projection operator and propose two pruning algorithms: SAFE and SAFE$^+$. Our framework unifies structured and unstructured pruning and intrinsically integrates flatness regularization. Extensive experiments on image classification and language modeling tasks demonstrate that, at equivalent sparsity levels, our methods consistently outperform state-of-the-art pruning approaches—achieving higher accuracy and exhibiting stronger robustness to noisy data.

📝 Abstract
Sparsifying neural networks often suffers from seemingly inevitable performance degradation, and it remains challenging to restore the original performance despite much recent progress. Motivated by recent studies in robust optimization, we aim to tackle this problem by finding subnetworks that are both sparse and flat at the same time. Specifically, we formulate pruning as a sparsity-constrained optimization problem where flatness is encouraged as an objective. We solve it explicitly via an augmented Lagrange dual approach and extend it further by proposing a generalized projection operation, resulting in novel pruning methods called SAFE and its extension, SAFE$^+$. Extensive evaluations on standard image classification and language modeling tasks reveal that SAFE consistently yields sparse networks with improved generalization performance, which compares competitively to well-established baselines. In addition, SAFE demonstrates resilience to noisy data, making it well-suited for real-world conditions.
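The formulation the abstract describes can be sketched as a single min–max problem (the notation below is assumed for illustration, not taken verbatim from the paper):

$$\min_{w:\,\|w\|_0 \le k}\ \max_{\|\epsilon\|_2 \le \rho} \mathcal{L}(w + \epsilon),$$

where $k$ is the sparsity budget and $\rho$ bounds the perturbation used to probe sharpness, so the inner maximization rewards flat minima. The augmented Lagrange dual approach then splits $w$ into a dense iterate and a sparse copy coupled through a multiplier, alternating gradient updates with a projection onto the $\ell_0$ ball.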
Problem

Research questions and friction points this paper aims to address.

Finding sparse and flat minima for neural networks
Improving pruning performance via constrained optimization
Enhancing generalization and noise resilience in pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparsity-constrained optimization with flatness objective
Augmented Lagrange dual approach for explicit solving
Generalized projection operation for improved pruning
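The three innovations above can be combined into a minimal sketch of one primal–dual iteration. This is an illustrative reconstruction, not the authors' implementation: `sam_grad` stands in for the flatness objective (a SAM-style perturbed gradient), `project_topk` is a hard-thresholding instance of the generalized projection, and the names `safe_step`, `lam`, and `beta` are assumptions of this sketch.

```python
import numpy as np

def sam_grad(w, grad_fn, rho=0.05):
    """Flatness-aware gradient: evaluate the gradient at a point
    perturbed in the ascent direction, w + rho * g / ||g||."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    return grad_fn(w + eps)

def project_topk(w, k):
    """Projection onto the constraint ||w||_0 <= k: keep the k
    largest-magnitude entries and zero out the rest."""
    out = w.copy()
    if k < w.size:
        drop = np.argsort(np.abs(w))[:-k]  # indices of smallest entries
        out[drop] = 0.0
    return out

def safe_step(w, z, lam, grad_fn, k, lr=0.1, rho=0.05, beta=1.0):
    """One augmented-Lagrangian iteration: a dense iterate w is pulled
    toward a sparse copy z by a multiplier lam and penalty beta, then z
    is re-projected onto the sparsity constraint and lam is updated."""
    g = sam_grad(w, grad_fn, rho)
    w = w - lr * (g + lam + beta * (w - z))      # primal step on w
    z = project_topk(w + lam / beta, k)          # primal step on z
    lam = lam + beta * (w - z)                   # dual ascent
    return w, z, lam
```

Run on a toy quadratic loss, the iterates converge to a weight vector whose sparse copy `z` never exceeds the budget `k`, which is the mechanism the constrained formulation relies on.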
Dongyeop Lee
POSTECH, South Korea
Kwanhee Lee
POSTECH, South Korea
Jinseok Chung
POSTECH, South Korea
Namhoon Lee
POSTECH
machine learning