🤖 AI Summary
This work addresses the challenge of effectively fusing orthogonally fine-tuned adapters—each specialized for distinct concept and style tasks—without requiring additional training, to enable high-quality multi-attribute image generation. Leveraging the Riemannian manifold structure of Group-and-Shuffle orthogonal matrices, the authors propose a training-free multiplicative adapter fusion method. This approach employs an efficiently approximated geodesic interpolation formula on the manifold and incorporates a spectral restoration transformation to preserve the spectral properties of the fused adapter. The method achieves, for the first time, joint generation using orthogonally fine-tuned adapters without further fine-tuning, successfully synthesizing high-fidelity images that simultaneously embody specified styles and concepts in subject-driven generation tasks, thereby demonstrating its effectiveness and superiority.
📝 Abstract
In the rapidly growing field of model training, there is constant practical interest in parameter-efficient fine-tuning and in techniques that use a small amount of training data to adapt a model to a narrow task. However, an open question remains: how can several adapters, each tuned for a different task, be combined into a single adapter that yields adequate results on all of them? In particular, merging subject and style adapters for generative models remains unresolved. In this paper we show that, in the case of orthogonal fine-tuning (OFT), a structured orthogonal parametrization and its geometric properties can be used to derive formulas for training-free adapter merging. Specifically, we characterize the structure of the manifold formed by the recently proposed Group-and-Shuffle ($\mathcal{GS}$) orthogonal matrices and obtain efficient formulas for approximating geodesics between two points on it. Additionally, we propose a spectral restoration transform that restores the spectral properties of the merged adapter for higher-quality fusion. Experiments on subject-driven generation tasks show that our technique for merging two $\mathcal{GS}$ orthogonal matrices can unite the concept and style features of different adapters. To the best of our knowledge, this is the first training-free method for merging multiplicative orthogonal adapters. Code is available at [https://github.com/ControlGenAI/OrthoFuse](https://github.com/ControlGenAI/OrthoFuse).
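To give intuition for the geodesic-interpolation idea, the sketch below merges two orthogonal matrices at the midpoint of the geodesic on the orthogonal group, using the standard matrix exponential/logarithm formulation. This is a generic illustration under that textbook formulation, not the paper's efficient $\mathcal{GS}$-structured formulas or its spectral restoration step; the function names here are hypothetical.

```python
import numpy as np
from scipy.linalg import expm, logm


def geodesic_merge(Q1, Q2, t=0.5):
    """Interpolate along the geodesic Q(t) = Q1 expm(t logm(Q1^T Q2)).

    For rotations (det = +1), logm(Q1^T Q2) is a real skew-symmetric
    matrix, so every Q(t) stays orthogonal; t = 0.5 gives a natural
    training-free "merge" of two orthogonal adapters. Illustrative
    only -- not the paper's GS-structured formula.
    """
    # np.real guards against tiny imaginary round-off from logm.
    return Q1 @ expm(t * np.real(logm(Q1.T @ Q2)))


def random_rotation(n, rng):
    """Sample a random rotation (orthogonal matrix with det = +1)."""
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip a column to land in SO(n)
    return Q


rng = np.random.default_rng(0)
Q1, Q2 = random_rotation(4, rng), random_rotation(4, rng)
M = geodesic_merge(Q1, Q2)
# The midpoint is itself orthogonal, unlike a naive average (Q1 + Q2) / 2.
print(np.allclose(M.T @ M, np.eye(4), atol=1e-8))
```

The key property, and the motivation for working on the manifold rather than averaging weights directly, is that the interpolant remains orthogonal for every `t`, so the merged adapter stays a valid multiplicative orthogonal update.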