Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport

📅 2025-08-02

📈 Citations: 0

✨ Influential: 0

career value

188K/year

🤖 AI Summary

This work addresses numerical instability and insufficient theoretical grounding in self-supervised learning for monophonic pitch estimation. We propose the first translation-equivariant self-supervised learning framework grounded in optimal transport (OT). Our method formalizes pitch translation invariance as Wasserstein distance minimization between one-dimensional probability distributions, yielding a theoretically rigorous, differentiable, and numerically stable loss function. Coupled with a translation-equivariant neural architecture, the framework enables end-to-end optimization. Crucially, this is the first systematic integration of OT theory into one-dimensional translation-equivariant signal modeling—replacing heuristic contrastive or reconstruction objectives prevalent in prior work. Evaluated on standard monophonic pitch estimation benchmarks, our approach achieves state-of-the-art performance, with marked improvements in training stability and generalization accuracy. These results empirically validate the framework’s theoretical soundness, computational robustness, and practical efficacy.

Technology Category

Application Category

📝 Abstract

In this paper, we propose an Optimal Transport objective for learning one-dimensional translation-equivariant systems and demonstrate its applicability to single pitch estimation. Our method provides a theoretically grounded, more numerically stable, and simpler alternative for training state-of-the-art self-supervised pitch estimators.

Problem

Research questions and friction points this paper is trying to address.

Develop translation-equivariant pitch estimation using Optimal Transport

Improve numerical stability in self-supervised pitch estimation

Simplify training for state-of-the-art pitch estimators

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Transport for translation-equivariant learning

Self-supervised pitch estimation method

Numerically stable and simpler training

🔎 Similar Papers

No similar papers found.