🤖 AI Summary
This work addresses a key challenge in integrating strong reasoning capabilities with domain-specific expertise, namely how to inject advanced reasoning without compromising specialized performance. The authors propose ReasonAny, a framework built on the observation that reasoning abilities are predominantly encoded in parameter regions with low gradient sensitivity. Leveraging this insight, they develop a training-free model merging strategy that identifies and preserves the parameters critical for both reasoning and domain tasks through gradient-based contrastive analysis. This approach circumvents a common pitfall of conventional fusion methods, which often degrade both reasoning depth and domain performance. Evaluated across diverse domains including safety, biomedicine, and finance, ReasonAny consistently outperforms existing techniques, simultaneously enhancing reasoning capability and domain-specific task accuracy.
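As a rough intuition for the merging strategy described above, the sketch below contrasts per-parameter gradient sensitivities of a reasoning-tuned and a domain-tuned model to decide which model's task vector to keep at each position. This is an illustrative assumption of how such a contrastive gradient mask could work, not the paper's exact procedure; the function name, the quantile threshold, and the masking rule are all hypothetical.

```python
import numpy as np

def contrastive_gradient_merge(base, reason_ft, domain_ft,
                               grad_reason, grad_domain, quantile=0.5):
    """Illustrative merge of two fine-tuned models into a base model
    using a contrastive gradient mask (an assumed simplification of
    gradient-based contrastive identification)."""
    delta_r = reason_ft - base      # task vector of the reasoning model
    delta_d = domain_ft - base      # task vector of the domain model
    # Contrastive score: how much more gradient-sensitive each parameter
    # is for the domain task than for the reasoning task.
    score = np.abs(grad_domain) - np.abs(grad_reason)
    # Per the paper's insight, reasoning ability lives in regions of LOW
    # reasoning-gradient sensitivity, so keep the reasoning delta where
    # the reasoning gradient is comparatively small (high score).
    thresh = np.quantile(score, quantile)
    keep_reason = score >= thresh
    return base + np.where(keep_reason, delta_r, delta_d)
```

Because the merge only selects between existing task vectors, it stays training-free: no gradient updates are applied, and the gradients serve purely as sensitivity statistics for masking.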
📝 Abstract
Large Reasoning Models (LRMs) with long chain-of-thought reasoning have recently achieved remarkable success. Yet, equipping domain-specialized models with such reasoning capabilities, referred to as "Reasoning + X", remains a significant challenge. While model merging offers a promising training-free solution, existing methods often suffer from a destructive performance collapse: they tend to both weaken reasoning depth and compromise domain-specific utility. Interestingly, we identify a counter-intuitive phenomenon underlying this failure: reasoning ability predominantly resides in parameter regions with low gradient sensitivity, contrary to the common assumption that domain capabilities correspond to high-magnitude parameters. Motivated by this insight, we propose ReasonAny, a novel merging framework that resolves the reasoning-domain performance collapse through Contrastive Gradient Identification. Experiments across the safety, biomedicine, and finance domains show that ReasonAny effectively synthesizes "Reasoning + X" capabilities, significantly outperforming state-of-the-art baselines while retaining robust reasoning performance.