🤖 AI Summary
To address the weak spatial awareness and lack of natural interaction in conventional desktop-based 3D audio design tools, this paper presents AudioMiXR, a six-degree-of-freedom (6DoF) spatial audio design interface for extended reality (XR). Implemented on Apple Vision Pro, the system is an augmented reality (AR) audio design interface that combines 6DoF hand and eye tracking, real-time spatial audio rendering, and cross-modal feedback to enable intuitive manipulation of virtual sound sources. An exploratory within-subjects study with 27 participants, including expert and non-expert sound designers, yields two design lessons: "Proprioception for AR Sound Design" and "Balancing Audio-Visual Modalities in AR GUIs." The results further identify high-potential application domains for 6DoF sound design, including education, game development, and accessibility.
📝 Abstract
We present AudioMiXR, an augmented reality (AR) interface for assessing how users manipulate virtual audio objects situated in their physical space with six degrees of freedom (6DoF), deployed on a head-mounted display (Apple Vision Pro) for 3D sound design. Existing tools for 3D sound design are typically constrained to desktop displays, which may limit spatial awareness of mixing within the execution environment. Using an XR HMD to create soundscapes may provide a real-time test environment for 3D sound design, as modern HMDs offer precise spatial localization aided by cross-modal interactions. However, there is no research on design guidelines specific to 6DoF sound design in XR. As a first step toward identifying design-related research directions in this space, we conducted an exploratory study with 27 participants, consisting of expert and non-expert sound designers. The goal was to derive design lessons that can inform future research in 3D sound design. We ran a within-subjects study in which users designed both music and cinematic soundscapes. After thematically analyzing participant data, we constructed two design lessons: 1. Proprioception for AR Sound Design, and 2. Balancing Audio-Visual Modalities in AR GUIs. Additionally, we identify application domains that can benefit most from 6DoF sound design based on our results.