Neural Network Pruning via QUBO Optimization

📅 2026-04-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limitations of existing neural network pruning methods, which often rely on greedy strategies that neglect filter interactions or employ oversimplified QUBO formulations—such as those based solely on L1-norm regularization—that compromise performance. To overcome these issues, the authors propose a hybrid QUBO pruning framework that integrates gradient-aware importance estimation with global combinatorial optimization, enforcing strict adherence to target sparsity through dynamic capacity-driven search. The approach innovatively incorporates first-order Taylor expansions and second-order Fisher information into the QUBO linear terms, while constructing quadratic terms from data-driven activation similarities. Furthermore, a Tensor-Train (TT) refinement stage is introduced to directly optimize actual performance metrics. Evaluated on the SIDD image denoising task, the method significantly outperforms greedy Taylor-based pruning and L1-QUBO baselines, with TT refinement consistently enhancing performance when applied at appropriate combinatorial scales.
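The summary above can be condensed into a hedged sketch of the QUBO objective it describes (symbol names here are illustrative, not taken from the paper): the linear coefficients combine first-order Taylor and Fisher importance, and the quadratic coefficients encode activation similarity between filter pairs.

```latex
% x_i \in \{0,1\}: keep (1) or prune (0) filter i.
% T_i: first-order Taylor importance, F_i: Fisher information,
% S_{ij}: activation similarity, \alpha,\beta: mixing weights (illustrative).
E(\mathbf{x}) \;=\; -\sum_{i} \bigl(\alpha\, T_i + \beta\, F_i\bigr)\, x_i
\;+\; \sum_{i<j} S_{ij}\, x_i x_j ,
\qquad \text{s.t. } \textstyle\sum_i x_i = k ,
```

where, per the abstract, the target-sparsity constraint $\sum_i x_i = k$ is enforced by a dynamic capacity-driven search rather than baked into the objective.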
📝 Abstract
Neural network pruning can be formulated as a combinatorial optimization problem, yet most existing approaches rely on greedy heuristics that ignore complex interactions between filters. Formal optimization methods such as Quadratic Unconstrained Binary Optimization (QUBO) provide a principled alternative but have so far underperformed due to oversimplified objective formulations based on metrics like the L1-norm. In this work, we propose a unified Hybrid QUBO framework that bridges heuristic importance estimation with global combinatorial optimization. Our formulation integrates gradient-aware sensitivity metrics - specifically first-order Taylor and second-order Fisher information - into the linear term, while utilizing data-driven activation similarity in the quadratic term. This allows the QUBO objective to jointly capture individual filter relevance and inter-filter functional redundancy. We further introduce a dynamic capacity-driven search to strictly enforce target sparsity without distorting the optimization landscape. Finally, we employ a two-stage pipeline featuring a Tensor-Train (TT) Refinement stage - a gradient-free optimizer that fine-tunes the QUBO-derived solution directly against the true evaluation metric. Experiments on the SIDD image denoising dataset demonstrate that the proposed Hybrid QUBO significantly outperforms both greedy Taylor pruning and traditional L1-based QUBO, with TT Refinement providing further consistent gains at appropriate combinatorial scales. This highlights the potential of hybrid combinatorial formulations for robust, scalable, and interpretable neural network compression.
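As a concrete illustration of the abstract's formulation, the sketch below builds a small pruning QUBO with gradient-aware linear terms and similarity-based quadratic terms, then solves it by enumeration. This is a minimal toy, not the paper's implementation: it enforces sparsity with a standard quadratic penalty, whereas the paper uses a dynamic capacity-driven search; all function names and weights are illustrative.

```python
import numpy as np

def build_qubo(importance, similarity, k, penalty=10.0):
    """Toy pruning QUBO: x_i = 1 means 'keep filter i'.

    Diagonal (linear) terms reward important filters; upper-triangular
    (quadratic) terms penalize keeping pairs of similar filters.
    A quadratic penalty P * (sum_i x_i - k)^2 softly enforces that
    exactly k filters survive (the paper instead uses a dynamic
    capacity-driven search for this constraint).
    """
    n = len(importance)
    Q = np.zeros((n, n))
    # Linear terms: negated gradient-aware importance (e.g. Taylor score).
    np.fill_diagonal(Q, -importance)
    # Quadratic terms: activation-similarity redundancy between pairs.
    iu = np.triu_indices(n, k=1)
    Q[iu] += similarity[iu]
    # Expand P*(sum x_i - k)^2 into QUBO form (constant P*k^2 dropped):
    # diagonal gains P*(1 - 2k), each off-diagonal pair gains 2P.
    np.fill_diagonal(Q, Q.diagonal() + penalty * (1 - 2 * k))
    Q[iu] += 2 * penalty
    return Q

def solve_brute_force(Q):
    """Exact minimizer by enumeration; viable only for small n."""
    n = Q.shape[0]
    best_x, best_e = None, np.inf
    for bits in range(2 ** n):
        x = np.array([(bits >> i) & 1 for i in range(n)], dtype=float)
        e = x @ Q @ x  # binary x, so x_i^2 = x_i on the diagonal
        if e < best_e:
            best_e, best_x = e, x
    return best_x, best_e
```

With four filters, a target of k = 2, and a strong similarity between the two most important filters, the minimizer keeps the top filter plus a dissimilar one instead of the two most important (but redundant) filters, which is exactly the interaction effect greedy importance ranking misses.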
Problem

Research questions and friction points this paper is trying to address.

Neural Network Pruning
Combinatorial Optimization
QUBO
Filter Redundancy
Sparsity
Innovation

Methods, ideas, or system contributions that make the work stand out.

QUBO optimization
neural network pruning
gradient-aware sensitivity
activation similarity
Tensor-Train refinement
Osama Orabi
Department of Computer Science and Engineering, Innopolis University, 420500 Innopolis, Russia; Q Deep, Laboratory of Quantum Computing, 420502 Innopolis, Russia; Research Center of the Artificial Intelligence Institute, Innopolis University, 420500 Innopolis, Russia
Artur Zagitov
Department of Computer Science and Engineering, Innopolis University, 420500 Innopolis, Russia; Q Deep, Laboratory of Quantum Computing, 420502 Innopolis, Russia; Research Center of the Artificial Intelligence Institute, Innopolis University, 420500 Innopolis, Russia
Hadi Salloum
Department of Computer Science and Engineering, Innopolis University, 420500 Innopolis, Russia; Q Deep, Laboratory of Quantum Computing, 420502 Innopolis, Russia; Research Center of the Artificial Intelligence Institute, Innopolis University, 420500 Innopolis, Russia
Viktor A. Lobachev
Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
Kasymkhan Khubiev
Sirius University of Science and Technology (НТУ Сириус)
artificial intelligence, machine learning, quantitative finance
Yaroslav Kholodov
Full Professor at Innopolis University
Data analysis, Intelligent transportation systems, Numerical methods, Applied mathematics