SPEAR-MM: Selective Parameter Evaluation and Restoration via Model Merging for Efficient Financial LLM Adaptation

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address severe forgetting of general reasoning capabilities (such as those needed for customer interaction and complex financial analysis) during large language model (LLM) adaptation to finance, this paper proposes a hierarchical model merging method grounded in post-hoc impact analysis. The approach combines selective Transformer layer freezing with spherical interpolation-based merging, using external benchmark-driven post-hoc analysis to identify capability-critical layers, thereby enabling interpretable capability preservation and efficient parameter restoration. Experiments on LLaMA-3.1-8B demonstrate a 91.2% retention rate of general capabilities, 21.5 percentage points higher than standard continual pretraining, while retaining 94% of domain-adaptation gains and reducing computational cost by 90%. The core contribution lies in unifying interpretable impact assessment with geometrically constrained model merging, achieving, for the first time in financial LLM adaptation, joint optimization of high-fidelity general-capability retention and efficient domain-specific enhancement.

📝 Abstract
Large language models (LLMs) adapted to financial domains often suffer from catastrophic forgetting of general reasoning capabilities essential for customer interactions and complex financial analysis. We introduce Selective Parameter Evaluation and Restoration via Model Merging (SPEAR-MM), a practical framework that preserves critical capabilities while enabling domain adaptation. Our method approximates layer-wise impact on external benchmarks through post-hoc analysis, then selectively freezes or restores transformer layers via spherical interpolation merging. Applied to LLaMA-3.1-8B for financial tasks, SPEAR-MM achieves 91.2% retention of general capabilities versus 69.7% for standard continual pretraining, while maintaining 94% of domain adaptation gains. The approach provides interpretable trade-off control and reduces computational costs by 90%, which is crucial for resource-constrained financial institutions.
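The abstract's "layer-wise impact on external benchmarks through post-hoc analysis" could be sketched as an ablation-style probe: for each transformer layer, temporarily restore the base model's weights into the domain-tuned model and measure how much a general-capability benchmark score recovers. The function and `evaluate` callback below are hypothetical names, since the paper's exact approximation is not given here; this is a minimal sketch of the idea, not the authors' implementation.

```python
def layer_impact_scores(base_layers, tuned_layers, evaluate):
    """Estimate each layer's impact on general capabilities post hoc.

    base_layers / tuned_layers: dicts mapping layer name -> weights.
    evaluate: callback scoring a model (dict of layers) on an external
    general-capability benchmark (higher is better). A large score
    recovery when a layer is restored marks it as capability-critical.
    """
    tuned_score = evaluate(tuned_layers)  # score after domain adaptation
    scores = {}
    for name in tuned_layers:
        probe = dict(tuned_layers)        # copy the tuned model
        probe[name] = base_layers[name]   # restore one layer to base weights
        scores[name] = evaluate(probe) - tuned_score
    return scores
```

Layers with the highest recovery scores would then be the candidates for freezing or restoration in the merging step.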
Problem

Research questions and friction points this paper is trying to address.

Catastrophic forgetting of general reasoning when LLMs are adapted to finance
Tension between preserving critical general capabilities and gaining domain expertise
High computational cost of adaptation for resource-constrained financial institutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective parameter freezing via model merging
Spherical interpolation for layer restoration
Post-hoc analysis for capability preservation
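The spherical interpolation (SLERP) restoration named above can be sketched as follows. Treating each layer's flattened weights as a vector, SLERP interpolates along the arc between the base and domain-tuned weights rather than along the straight line, preserving weight norms better than linear averaging. The function names, the per-layer `critical` set, and the mixing coefficient `t` are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def slerp(w_base, w_tuned, t=0.5, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns the base weights, t=1 the domain-tuned weights.
    """
    v0, v1 = w_base.ravel(), w_tuned.ravel()
    n0, n1 = np.linalg.norm(v0), np.linalg.norm(v1)
    cos_omega = np.clip(np.dot(v0 / (n0 + eps), v1 / (n1 + eps)), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if omega < eps:
        # vectors nearly parallel: fall back to linear interpolation
        merged = (1.0 - t) * v0 + t * v1
    else:
        merged = (np.sin((1.0 - t) * omega) * v0
                  + np.sin(t * omega) * v1) / np.sin(omega)
    return merged.reshape(w_base.shape)

def merge_model(base_layers, tuned_layers, critical, t=0.5):
    """Pull capability-critical layers back toward the base model via
    SLERP; keep the domain-tuned weights everywhere else."""
    return {
        name: slerp(base_layers[name], tuned_layers[name], t)
        if name in critical else tuned_layers[name]
        for name in tuned_layers
    }
```

In this sketch, `critical` would come from the post-hoc impact analysis, and `t` exposes the interpretable trade-off between general-capability retention and domain gains that the abstract describes.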