Towards HRTF Personalization using Denoising Diffusion Models

📅 2025-01-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the strong subject dependence of HRTFs, this paper proposes a first approach to generating personalized HRIRs with a conditional denoising diffusion probabilistic model (DDPM). Anthropometric measurements of the ears, head and torso are supplied as conditioning inputs that guide the diffusion process, so HRIRs are synthesized directly in the time domain. The results show that DDPMs are feasible for HRTF personalization, with performance in line with state-of-the-art models.

📝 Abstract

Head-Related Transfer Functions (HRTFs) have fundamental applications for realistic rendering in immersive audio scenarios. However, they are strongly subject-dependent, as they vary considerably depending on the shape of the ears, head and torso. Thus, personalization procedures are required for accurate binaural rendering. Recently, Denoising Diffusion Probabilistic Models (DDPMs), a class of generative learning techniques, have been applied to solve a variety of signal processing-related problems. In this paper, we propose a first approach for using DDPMs conditioned on anthropometric measurements to generate personalized Head-Related Impulse Responses (HRIRs), the time-domain representation of HRTFs. The results show the feasibility of DDPMs for HRTF personalization, obtaining performance in line with state-of-the-art models.
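To make the idea of a DDPM conditioned on anthropometric measurements more concrete, the following is a minimal sketch of the standard DDPM forward (noising) process and of how one training pair for a conditional noise-prediction network could be assembled. It is not the paper's implementation: the linear schedule values, the 1000 steps, the toy four-sample HRIR, the two anthropometric values, and all function names are assumptions chosen for illustration, and no denoising network is defined or trained here.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (values are assumed)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, alpha_bars, noise):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]


def training_example(hrir, anthropometry, alpha_bars, rng):
    """Build one (inputs, target) pair for a conditional noise-prediction network."""
    t = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in hrir]
    x_t = q_sample(hrir, t, alpha_bars, noise)
    # A hypothetical denoiser eps_theta(x_t, t, anthropometry) would be
    # trained to predict `noise`; the anthropometry is the conditioning input.
    return (x_t, t, anthropometry), noise


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
toy_hrir = [0.0, 1.0, -0.5, 0.25]  # toy 4-sample impulse response (hypothetical)
toy_anthro = [15.2, 6.4]           # hypothetical measurements, e.g. head width and pinna height in cm
inputs, target = training_example(toy_hrir, toy_anthro, alpha_bars, rng)
```

At generation time, a trained network of this kind would start from Gaussian noise and iteratively denoise it while receiving a new subject's anthropometric vector, which is what allows the output HRIR to be personalized.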

Problem

Research questions and friction points this paper is trying to address.

Personalized HRTF

Immersive Audio

Individual Differences

Innovation

Methods, ideas, or system contributions that make the work stand out.

DDPMs

Personalized HRTFs

Immersive Audio

🔎 Similar Papers

A Survey on Personalized Content Synthesis with Diffusion Models