SOAF: Scene Occlusion-aware Neural Acoustic Field

📅 2024-07-02
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses audio-visual novel-view synthesis along arbitrary trajectories in indoor scenes, observing that existing methods neglect the effect of wall occlusions on sound propagation and therefore produce spatially inconsistent audio in multi-room environments. To address this, the authors propose an occlusion-aware neural acoustic field framework that (1) explicitly models wall geometry for occlusion reasoning; (2) combines a distance-aware parametric sound-propagation prior with scene-structure encoding learned from the input video; and (3) extracts local acoustic features via Fibonacci sphere sampling, coupled with a direction-aware attention mechanism that strengthens sound-source directional modeling. Evaluated on the real-world RWAVS dataset and the synthetic SoundSpaces dataset, the method outperforms state-of-the-art approaches in binaural audio fidelity and spatial consistency.
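The Fibonacci sphere sampling mentioned in the summary is a standard construction for placing near-uniform direction samples on the unit sphere via the golden angle. A minimal NumPy sketch (the function name and sample count are illustrative, not from the paper):

```python
import numpy as np

def fibonacci_sphere(n: int) -> np.ndarray:
    """Return n near-uniform unit vectors via the golden-angle (Fibonacci) spiral."""
    i = np.arange(n)
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))   # ~2.39996 rad
    z = 1.0 - 2.0 * (i + 0.5) / n                 # evenly spaced heights in (-1, 1)
    r = np.sqrt(1.0 - z * z)                      # radius of the horizontal circle at height z
    theta = golden_angle * i                      # azimuth advances by the golden angle
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=-1)

dirs = fibonacci_sphere(64)  # 64 unit directions around the receiver
```

Each row is a unit vector, so the samples can serve directly as query directions for a local acoustic field centered at the receiver.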

📝 Abstract
This paper tackles the problem of novel view audio-visual synthesis along an arbitrary trajectory in an indoor scene, given the audio-video recordings from other known trajectories of the scene. Existing methods often overlook the effect of room geometry, particularly wall occlusions on sound propagation, making them less accurate in multi-room environments. In this work, we propose a new approach called Scene Occlusion-aware Acoustic Field (SOAF) for accurate sound generation. Our approach derives a global prior for the sound field using distance-aware parametric sound-propagation modeling and then transforms it based on the scene structure learned from the input video. We extract features from the local acoustic field centered at the receiver using a Fibonacci Sphere to generate binaural audio for novel views with a direction-aware attention mechanism. Extensive experiments on the real dataset RWAVS and the synthetic dataset SoundSpaces demonstrate that our method outperforms previous state-of-the-art techniques in audio generation.
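The abstract's "distance-aware parametric sound-propagation modeling" is not specified in this card; as a stand-in, the sketch below uses a common inverse-distance amplitude prior with a reference-distance clamp. The function name, clamp form, and `ref_dist` default are assumptions for illustration only:

```python
import numpy as np

def distance_gain(source: np.ndarray, receiver: np.ndarray, ref_dist: float = 0.2) -> float:
    """Hypothetical inverse-distance amplitude prior, clamped near the source.

    Gain is 1.0 within ref_dist of the source and falls off as 1/d beyond it.
    """
    d = float(np.linalg.norm(source - receiver))
    return ref_dist / max(d, ref_dist)
```

A prior like this gives the network a physically plausible global trend in loudness versus distance, which the scene-structure encoding can then modulate for occlusions.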
Problem

Research questions and friction points this paper is trying to address.

Novel view audio-visual synthesis in indoor scenes
Modeling sound propagation with room geometry occlusion
Generating accurate binaural audio for arbitrary trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distance-aware parametric sound-propagation modeling
Scene structure learning from input video
Direction-aware attention for binaural audio
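One plausible reading of "direction-aware attention" is softmax-attention pooling over per-direction acoustic features, with attention logits given by the cosine similarity between each sampled direction and the listener's facing direction. The sketch below is a minimal stand-in under that assumption; the function name, temperature, and pooling form are not from the paper:

```python
import numpy as np

def direction_aware_pool(listener_dir: np.ndarray,
                         sample_dirs: np.ndarray,
                         feats: np.ndarray,
                         temp: float = 0.1) -> np.ndarray:
    """Attention-pool per-direction features toward the listener's orientation.

    listener_dir: (3,) unit vector; sample_dirs: (N, 3) unit vectors;
    feats: (N, D) acoustic features, one per sampled direction.
    """
    logits = sample_dirs @ listener_dir / temp  # cosine similarity as attention logits
    w = np.exp(logits - logits.max())           # numerically stable softmax
    w /= w.sum()
    return w @ feats                            # (D,) pooled feature
```

Running the same pooling with the left-ear and right-ear orientations would yield two feature vectors, from which a decoder could predict the binaural channels.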