🤖 AI Summary
This work addresses the problem of automatically identifying musical samples and tracing the material they originate from. The proposed method is a self-supervised contrastive learning framework designed specifically for this task. Its core innovation is generating artificial remix positive pairs from high-quality separated stems obtained via state-of-the-art source separation, enabling a contrastive objective grounded in cross-version audio matching that captures sample variability across diverse mixing conditions and musical genres. The pipeline combines audio signal processing, stem separation, and large-scale retrieval against a reference database, and requires no manual annotations. Experiments demonstrate strong generalization across genres and significant improvements over prior state-of-the-art baselines, and the method remains stable and scales well as the number of noise songs in the reference database increases.
📝 Abstract
Sampling, the practice of reusing pieces of existing audio tracks to create new musical content, is very common in modern music production. In this paper, we tackle the challenging task of automatic sample identification: detecting such sampled content and retrieving the material from which it originates. To do so, we adopt a self-supervised learning approach that leverages a multi-track dataset to create positive pairs of artificial mixes, and we design a novel contrastive learning objective. We show that this method significantly outperforms previous state-of-the-art baselines, is robust across various genres, and scales well as the number of noise songs in the reference database increases. In addition, we extensively analyze the contribution of the different components of our training pipeline and highlight, in particular, the need for high-quality separated stems for this task.
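The contrastive objective described above can be illustrated with an InfoNCE-style loss, a common choice for such self-supervised frameworks. The sketch below is an assumption about the general shape of the objective, not the paper's exact formulation: row `i` of each batch holds the embedding of one artificial mix, and the matching row of the other batch holds its positive pair (a different mix sharing the same sampled stem), with all other rows acting as in-batch negatives.

```python
import numpy as np

def info_nce_loss(query_emb, ref_emb, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative sketch).

    query_emb, ref_emb: (batch, dim) arrays; row i of each is a positive
    pair (two artificial mixes containing the same stem), and every other
    row in ref_emb serves as an in-batch negative for query i.
    """
    # L2-normalise so the dot product is a cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    logits = q @ r.T / temperature  # (batch, batch) similarity matrix
    # Softmax cross-entropy with the diagonal (true pairs) as the target
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Toy usage with random embeddings: matched pairs give a lower loss
# than unrelated ones.
rng = np.random.default_rng(0)
mix_a = rng.normal(size=(8, 16))
mix_b = mix_a + 0.05 * rng.normal(size=(8, 16))  # slightly perturbed positives
loss_matched = info_nce_loss(mix_a, mix_b)
loss_random = info_nce_loss(mix_a, rng.normal(size=(8, 16)))
```

In the paper's setting, the embeddings would come from an audio encoder applied to the artificial remixes; the temperature and batch construction here are placeholder choices.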