UNMIXX: Untangling Highly Correlated Singing Voices Mixtures

📅 2026-01-19
🤖 AI Summary
This work addresses multi-part singing voice separation, where performance is limited by scarce training data and highly correlated sources. The authors propose UNMIXX, a framework with three components: a musically informed mixing strategy that synthesizes realistic, highly correlated training mixtures; a cross-source attention mechanism that uses reverse attention to push the representations of the two singers apart; and a magnitude penalty loss that penalizes interfering energy wrongly assigned to a source. Experiments show that UNMIXX improves SDRi (Source-to-Distortion Ratio improvement) by more than 2.2 dB over prior work.
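For readers unfamiliar with the metric, SDRi is the SDR of the separated estimate minus the SDR of the unprocessed mixture, both measured against the same reference voice. The sketch below uses the plain energy-ratio definition of SDR; which SDR variant the paper reports is not stated on this page, so treat the formula choice, the function names, and the toy signals as assumptions made for illustration.

```python
import math

def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2)."""
    signal_energy = sum(s * s for s in reference)
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))

def sdr_improvement_db(reference, estimate, mixture):
    """SDRi: SDR of the estimate minus SDR of the raw mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)

# Toy signals, invented for illustration (not data from the paper).
voice_a = [1.0, -1.0, 1.0, -1.0]
voice_b = [0.5, 0.5, -0.5, -0.5]
mixture = [a + b for a, b in zip(voice_a, voice_b)]
# Hypothetical separator output that leaves 10% of voice_b's amplitude behind.
estimate = [a + 0.1 * b for a, b in zip(voice_a, voice_b)]

print(round(sdr_improvement_db(voice_a, estimate, mixture), 2))  # prints 20.0
```

Cutting the interfering amplitude by a factor of 10 cuts the error energy by a factor of 100, which is where the 20 dB in this toy case comes from.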

📝 Abstract
We introduce UNMIXX, a novel framework for multiple singing voices separation (MSVS). While related to speech separation, MSVS faces unique challenges: data scarcity and the highly correlated nature of singing voice mixtures. To address these issues, we propose UNMIXX with three key components: (1) a musically informed mixing strategy to construct highly correlated, music-like mixtures, (2) cross-source attention that drives the representations of two singers apart via reverse attention, and (3) a magnitude penalty loss that penalizes erroneously assigned interfering energy. UNMIXX not only addresses data scarcity by simulating realistic training data, but also excels at separating highly correlated mixtures through cross-source interactions at both the architectural and loss levels. Our extensive experiments demonstrate that UNMIXX greatly enhances performance, with SDRi gains exceeding 2.2 dB over prior work.
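The abstract does not give the formulation of its reverse attention, so the following is only one possible reading, not the authors' implementation: a softmax over negated similarity scores, so that each frame of one singer attends most strongly to the frames of the other singer that resemble it least. The function names, the scaling by the square root of the feature dimension, and the toy vectors are all assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_cross_attention(queries_a, keys_b, values_b):
    """Hypothetical reverse attention from source A to source B.

    Similarity scores are negated before the softmax, so frames of B
    that are LEAST similar to a query from A receive the largest weight.
    """
    dim = len(queries_a[0])
    out_dim = len(values_b[0])
    attended = []
    for q in queries_a:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys_b]
        weights = softmax([-s for s in scores])  # the "reverse" step
        attended.append([sum(w * v[d] for w, v in zip(weights, values_b))
                         for d in range(out_dim)])
    return attended

# Toy features, invented for illustration.
queries_a = [[1.0, 0.0]]
keys_b = [[1.0, 0.0], [0.0, 1.0]]    # first key matches the query, second does not
values_b = [[1.0, 0.0], [0.0, 1.0]]  # one-hot values expose the attention weights
result = reverse_cross_attention(queries_a, keys_b, values_b)
print(result[0][1] > result[0][0])  # prints True: the dissimilar frame is weighted more
```

Ordinary cross-attention would give the matching frame the larger weight; negating the scores inverts that preference, which is the sense in which such a mechanism could push two correlated sources apart.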
Problem

Research questions and friction points this paper is trying to address.

singing voice separation
highly correlated mixtures
data scarcity
multiple singing voices
Innovation

Methods, ideas, or system contributions that make the work stand out.

singing voice separation
cross-source attention
musically informed mixing
magnitude penalty loss
highly correlated mixtures