Pruning Large Language Models by Identifying and Preserving Functional Networks

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM structured pruning methods overlook functional collaboration among neurons, often disrupting the model’s macro-level functional architecture and causing substantial performance degradation. To address this, we propose a functional-network-based pruning paradigm—introducing, for the first time, principles from cognitive neuroscience’s functional network analysis into LLM compression. We conceptualize the LLM as a “digital brain” and identify cross-layer collaborative functional subnetworks via activation pattern clustering; critical neurons within these subnetworks are then preserved in a structured manner. This approach enables efficient pruning while maintaining global functional integrity. Experiments across multiple benchmark tasks show that, with 30–50% model size reduction, performance drops by only 0.5–2.1 percentage points—significantly outperforming baseline methods. Our implementation is publicly available.

📝 Abstract
Structured pruning is one of the representative techniques for compressing large language models (LLMs) to reduce GPU memory consumption and accelerate inference. It offers significant practical value for improving the efficiency of LLMs in real-world applications. Current structured pruning methods typically assess the importance of structural units and prune the less important ones. Most of them overlook the interaction and collaboration among artificial neurons that are crucial to the functionality of LLMs, disrupting the macro-level functional architecture of LLMs and consequently degrading pruning performance. Inspired by the inherent similarities between artificial neural networks and functional neural networks in the human brain, we alleviate this challenge by pruning LLMs through identifying and preserving the functional networks within them. To achieve this, we treat an LLM as a digital brain and decompose it into functional networks, analogous to identifying functional brain networks in neuroimaging data. The LLM is then pruned by preserving the key neurons within these functional networks. Experimental results demonstrate that the proposed method successfully identifies and locates functional networks and key neurons in LLMs, enabling efficient model pruning. Our code is available at https://github.com/WhatAboutMyStar/LLM_ACTIVATION.
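The first step the abstract describes, decomposing an LLM into functional networks by grouping neurons with similar activation patterns, might be sketched roughly as below. This is a hedged illustration only: the function name, the simple cosine k-means clustering, and all parameters are assumptions for exposition, not the authors' actual method (their real implementation is in the linked repository).

```python
import numpy as np

def identify_functional_networks(activations, n_networks, n_iter=25, seed=0):
    """Toy sketch: cluster neurons into 'functional networks' by the shape of
    their activation profiles over a probe corpus.
    activations: (n_neurons, n_tokens) array of recorded activations."""
    rng = np.random.default_rng(seed)
    # Center and normalize each neuron's profile so clustering compares
    # activation *pattern*, not scale.
    X = activations - activations.mean(axis=1, keepdims=True)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    # Farthest-point init: start from a random neuron, then repeatedly add
    # the neuron least similar to all centroids chosen so far.
    centroids = [X[rng.integers(len(X))]]
    for _ in range(n_networks - 1):
        sims = np.max(np.stack([X @ c for c in centroids]), axis=0)
        centroids.append(X[np.argmin(sims)])
    centroids = np.stack(centroids)
    # Plain k-means iterations under cosine similarity.
    for _ in range(n_iter):
        labels = np.argmax(X @ centroids.T, axis=1)
        for k in range(n_networks):
            members = X[labels == k]
            if len(members):
                c = members.mean(axis=0)
                centroids[k] = c / (np.linalg.norm(c) + 1e-8)
    return labels

# Toy demo: two groups of neurons whose activations track two latent signals.
rng = np.random.default_rng(1)
base_a, base_b = rng.normal(size=200), rng.normal(size=200)
acts = np.vstack([base_a + 0.1 * rng.normal(size=200) for _ in range(5)]
                 + [base_b + 0.1 * rng.normal(size=200) for _ in range(5)])
labels = identify_functional_networks(acts, n_networks=2)
```

On this toy data the first five neurons land in one cluster and the last five in the other, mirroring the idea of recovering coherent functional groups from activations alone.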
Problem

Research questions and friction points this paper is trying to address.

Identify functional networks in large language models
Preserve key neurons during structured pruning
Improve pruning performance by maintaining model functionality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prunes LLMs by preserving functional networks
Treats LLM as digital brain for decomposition
Identifies key neurons within functional networks
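The "preserve key neurons within functional networks" idea above can be contrasted with global importance ranking in a small sketch. Everything here is an illustrative assumption (function name, importance scores, per-network keep ratio), not the paper's actual pruning criterion: the point is only that keeping the top neurons *within each network* guarantees no functional network is pruned away entirely.

```python
import numpy as np

def network_preserving_mask(labels, importance, keep_ratio):
    """Return a boolean keep-mask that retains the most important neurons
    within each functional network, rather than ranking neurons globally."""
    labels = np.asarray(labels)
    importance = np.asarray(importance, dtype=float)
    mask = np.zeros(labels.shape, dtype=bool)
    for net in np.unique(labels):
        idx = np.flatnonzero(labels == net)
        # Keep at least one neuron so no network is wiped out entirely.
        n_keep = max(1, int(round(keep_ratio * len(idx))))
        top = idx[np.argsort(importance[idx])[-n_keep:]]
        mask[top] = True
    return mask

# Two networks of four neurons each; keep the top half of each network.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
importance = [0.9, 0.1, 0.2, 0.8, 0.05, 0.07, 0.9, 0.02]
mask = network_preserving_mask(labels, importance, keep_ratio=0.5)
```

A purely global top-k over these scores would keep three neurons from network 0 and only one from network 1; the per-network mask keeps two from each, which is the structural intuition behind preserving functional networks.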
Yiheng Liu
School of Automation, Northwestern Polytechnical University, Xi’an, China
Junhao Ning
School of Automation, Northwestern Polytechnical University, Xi’an, China
Sichen Xia
School of Automation, Northwestern Polytechnical University, Xi’an, China
Xiaohui Gao
Ning Qiang
School of Physics and Information Technology, Shaanxi Normal University, Xi’an, China
Bao Ge
School of Automation, Northwestern Polytechnical University, Xi’an, China
Junwei Han
School of Automation, Northwestern Polytechnical University, Xi’an, China
Xintao Hu
Northwestern Polytechnical University
neuroimaging · multimedia · machine learning