Discovering Decoupled Functional Modules in Large Language Models

📅 2026-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
The internal organization of functional modules in large language models (LLMs) remains poorly understood, and effective methods for disentangling neurons and linking them to semantic concepts are lacking. To address this gap, this work proposes ULCMOD, an unsupervised cross-layer module discovery framework built on a novel objective function and an Iterative Decoupling (IterD) algorithm. ULCMOD enables, for the first time, a comprehensive functional partitioning of neurons across the entire LLM architecture, and it aligns these partitions with the thematic semantics of input samples. Experiments show that the discovered modules exhibit clear semantic coherence, a hierarchical spatial structure, and task specialization, leading to strong performance on downstream tasks. The approach substantially improves model interpretability and fills a critical gap in the study of functional disentanglement in LLMs.

📝 Abstract
Understanding the internal functional organization of Large Language Models (LLMs) is crucial for improving their trustworthiness and performance. However, how LLMs organize different functions into modules remains largely unexplored. To bridge this gap, we formulate a functional module discovery problem and propose an Unsupervised LLM Cross-layer MOdule Discovery (ULCMOD) framework that simultaneously disentangles the large set of neurons in the entire LLM into modules while discovering the topics of input samples related to these modules. Our framework introduces a novel objective function and an efficient Iterative Decoupling (IterD) algorithm. Extensive experiments show that our method discovers high-quality, disentangled modules that capture more meaningful semantic information and achieve superior performance on various downstream tasks. Moreover, our qualitative analysis reveals that the discovered modules exhibit semantic coherence, interpretable specializations, and a clear spatial and hierarchical organization within the LLM. Our work provides a novel tool for interpreting the functional modules of LLMs, filling a critical gap in LLM interpretability research.
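The full ULCMOD algorithm is not reproduced on this page, but the core idea the abstract describes — jointly partitioning neurons into modules while discovering the topics of the inputs that drive them — can be illustrated with a toy alternating co-clustering sketch on a synthetic activation matrix. Everything below (the shapes, the planted block structure, the one-module-per-topic simplification, and the update rules) is an illustrative assumption, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for cross-layer activations: rows are input samples,
# columns are neurons pooled across all layers. Shapes, seed, and the
# planted block structure are illustrative assumptions.
n_samples, n_neurons, k = 60, 39, 3
A = rng.normal(scale=0.5, size=(n_samples, n_neurons))
for b in range(k):  # plant k sample-topic / neuron-module blocks
    A[b * 20:(b + 1) * 20, b * 13:(b + 1) * 13] += 3.0

# Alternating co-clustering (one module per topic, a strong
# simplification): samples pick topics, neurons pick modules, repeat.
topic = rng.integers(k, size=n_samples)
module = np.zeros(n_neurons, dtype=int)
for _ in range(20):
    # Each neuron joins the topic/module where it fires most on average.
    profile = np.stack([
        A[topic == t].mean(axis=0) if (topic == t).any()
        else np.zeros(n_neurons)
        for t in range(k)
    ])
    module = profile.argmax(axis=0)
    # Each sample joins the topic whose module activates most for it.
    summary = np.stack([
        A[:, module == m].mean(axis=1) if (module == m).any()
        else np.full(n_samples, -np.inf)
        for m in range(k)
    ])
    topic = summary.argmax(axis=0)

# With the planted blocks, each group of 20 samples typically ends up
# sharing a single topic label (up to label permutation).
print(topic.reshape(k, 20))
```

The alternating structure mirrors, at a cartoon level, the "disentangle neurons into modules while discovering input topics" loop the abstract sketches; the paper's actual objective and IterD updates would replace these naive argmax assignments.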
Problem

Research questions and friction points this paper is trying to address.

functional modules
Large Language Models
module discovery
interpretability
neuron disentanglement
Innovation

Methods, ideas, or system contributions that make the work stand out.

functional module discovery
unsupervised disentanglement
large language models
Iterative Decoupling
model interpretability
Yanke Yu
Thrust of AI, Hong Kong University of Science and Technology (Guangzhou)
Jin Li
Thrust of AI, Hong Kong University of Science and Technology (Guangzhou)
Ying Sun
The Hong Kong University of Science and Technology (Guangzhou)
Data Mining · Machine Learning
Ping Li
Huawei Cloud
Zhefeng Wang
Huawei Cloud
NLP · AI system · LLM · multi-modality · Machine Learning
Yi Zheng
Huawei Technologies Ltd.