🤖 AI Summary
Current pitch curve generators suffer from two key limitations: (1) difficulty in modeling singer-specific expressivity, and (2) task-specific design (e.g., for pitch correction or singing voice synthesis) that limits generalizability. This paper introduces the first cross-task transferable framework for universal pitch curve generation. It implicitly learns vocal style from reference audio while preserving melodic alignment, enabling high-fidelity stylistic modeling. The method builds on a modified flow-matching architecture, conditioned jointly on symbolic musical scores and pitch-context features, to generate stylistic pitch curves end-to-end. Experiments show that the model significantly outperforms baselines in style similarity and audio naturalness, with marked gains in subjective evaluation, while maintaining pitch accuracy comparable to task-specialized models.
📝 Abstract
Existing pitch curve generators face two main challenges. First, they often neglect singer-specific expressiveness, reducing their ability to capture individual singing styles. Second, they are typically developed as auxiliary modules for specific tasks such as pitch correction, singing voice synthesis, or voice conversion, which restricts their generalization capability. We propose StylePitcher, a general-purpose pitch curve generator that learns singer style from reference audio while preserving alignment with the intended melody. Built upon a rectified flow matching architecture, StylePitcher flexibly incorporates symbolic music scores and pitch context as conditions for generation, and can seamlessly adapt to diverse singing tasks without retraining. Objective and subjective evaluations across various singing tasks demonstrate that StylePitcher improves style similarity and audio quality while maintaining pitch accuracy comparable to task-specific baselines.
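The abstract names rectified flow matching as the backbone but does not spell out the training objective. As background, the core idea is to regress a velocity field along the straight line between a prior sample and the data, then integrate that field at inference. The sketch below is a minimal illustration of that interpolation identity only, with a sine wave standing in for a pitch curve; the conditioning on scores, pitch context, and reference style, and the network itself, are omitted, and all names here are illustrative rather than from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rectified_flow_pair(x0, x1, t):
    """Point on the straight path from x0 to x1 at time t,
    and the constant target velocity a model would regress."""
    xt = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return xt, v_target

# Toy stand-ins: x1 plays the role of a target pitch curve,
# x0 is a sample from the Gaussian prior.
x1 = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
x0 = rng.standard_normal(64)

t = 0.3
xt, v = rectified_flow_pair(x0, x1, t)

# At inference, Euler steps follow the predicted velocity. With the
# oracle velocity, one step from t to 1 recovers the target exactly:
x_rec = xt + (1.0 - t) * v
assert np.allclose(x_rec, x1)
```

Because the target velocity is constant along each straight path, sampling can use few integration steps, which is one reason rectified flow variants are attractive for fast conditional generation.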