A Cold Diffusion Approach for Percussive Dereverberation

📅 2026-05-11

🤖 AI Summary
This work addresses dereverberation of percussive signals, whose sharp transients and dense time-domain structure limit the effectiveness of existing speech-oriented methods. It introduces a cold diffusion framework for this task, modeling reverberation as a deterministic degradation process and exploring two parameterizations of the reverse process: direct prediction and Delta-normalized residual prediction. The models use UNet and Diffusion Transformer backbones, are trained on data synthesized with both simulated and real room impulse responses, and are evaluated with percussion-specific signal and perceptual metrics. The proposed approach consistently outperforms score-based and conditional diffusion baselines on both in-domain and fully out-of-domain test sets, indicating that this transient-aware diffusion strategy generalizes well.
📝 Abstract
Most recent advances in audio dereverberation focus almost exclusively on speech, leaving percussive and drum signals largely unexplored despite their importance in music production. Percussive dereverberation poses distinct challenges due to sharp transients and dense temporal structure. In this work, we propose a cold diffusion framework for dereverberating stereo drum stems (downmixes), modeling reverberation as a deterministic degradation process that progressively transforms anechoic signals into reverberant ones. We investigate two reverse-process parameterizations, a Direct (next-state) prediction and a Delta-normalized residual (velocity-style) prediction, and implement the framework using both a UNet and a diffusion Transformer backbone. The models are trained and evaluated on curated datasets comprising both acoustic and electronic drum recordings, with reverberation generated using a combination of synthetic and real room impulse responses. Extensive experiments on in-domain and fully out-of-domain test sets demonstrate that the proposed method consistently outperforms strong score-based and conditional diffusion baselines, as measured by signal-based and perceptual metrics tailored to percussive audio.
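To make the idea of a deterministic degradation and its two reverse parameterizations concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear blend between the anechoic (dry) and reverberant (wet) signals as the degradation schedule, and it reads "Delta-normalized residual" as the step-to-step difference divided by the step size; both are illustrative assumptions, since the abstract does not specify them. The names `degrade`, `training_target`, and `reverse_process` are hypothetical, and `predict` stands in for a trained UNet or diffusion Transformer.

```python
import numpy as np


def degrade(dry, wet, t, num_steps):
    """Deterministic degradation: state t lies between dry (t=0) and wet (t=num_steps).

    ASSUMPTION: a linear blend; the paper's actual schedule is not given in the abstract.
    """
    alpha = t / num_steps
    return (1.0 - alpha) * dry + alpha * wet


def training_target(dry, wet, t, num_steps, mode):
    """Regression target for a network that sees the state at step t."""
    x_prev = degrade(dry, wet, t - 1, num_steps)
    if mode == "direct":
        # Direct (next-state): predict the slightly less reverberant state itself.
        return x_prev
    if mode == "delta":
        # ASSUMPTION: residual divided by the step size, giving a velocity-like quantity.
        x_t = degrade(dry, wet, t, num_steps)
        return (x_prev - x_t) * num_steps
    raise ValueError(f"unknown mode: {mode!r}")


def reverse_process(wet, predict, num_steps, mode):
    """Run the reverse process from the reverberant input down to step 0."""
    if mode not in ("direct", "delta"):
        raise ValueError(f"unknown mode: {mode!r}")
    x = np.array(wet, dtype=float)
    delta = 1.0 / num_steps
    for t in range(num_steps, 0, -1):
        out = predict(x, t)
        # Direct: the output is the next state; delta: take one step along the velocity.
        x = out if mode == "direct" else x + delta * out
    return x
```

With an oracle `predict` that returns the exact training target, both modes recover the dry signal from the wet one, which is a quick sanity check that the forward and reverse definitions agree. Under the linear schedule assumed here, the delta target is the same at every step (`dry - wet`), which is one reason a velocity-style output can be a convenient regression target. The functions operate elementwise, so a stereo array of shape `(2, num_samples)` works as well as a mono one.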
Problem

Research questions and friction points this paper is trying to address.

percussive dereverberation
audio dereverberation
drum signals
reverberation
transient signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

cold diffusion
percussive dereverberation
deterministic degradation
diffusion Transformer
drum stems