🤖 AI Summary
To address performance degradation in sound event classification (SEC) caused by microphone device variability, this paper proposes a unified feature-mapping framework based on a frequency-response-conditioned CycleGAN. The core innovation is the incorporation of microphone frequency response information into the CycleGAN generator via Feature-wise Linear Modulation (FiLM), enabling a single model to perform bidirectional, unpaired time-frequency feature translation across arbitrary microphone pairs. This removes the conventional requirement of training a separate model for each device pair and substantially improves cross-device robustness. Experiments on standard SEC benchmarks show that the method achieves a 2.6% absolute improvement in macro-average F1 score over the state of the art while reducing inter-device variability in macro-average F1 score by 0.8%, validating its effectiveness and generalizability.
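To make the conditioning mechanism concrete, here is a minimal NumPy sketch of FiLM applied to a generator's intermediate time-frequency feature map. All shapes, the frequency-response descriptor, and the linear FiLM generator are illustrative assumptions, not the authors' implementation: the idea is only that a per-device frequency response vector is projected to per-channel scale (gamma) and shift (beta) parameters that modulate the features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: channels, frequency bins, time frames.
C, F, T = 8, 64, 32
R = 16  # length of the (assumed) frequency-response descriptor

# Intermediate time-frequency feature map inside the generator.
features = rng.standard_normal((C, F, T))

# Frequency response descriptor of the target microphone (assumed input).
freq_response = rng.standard_normal(R)

# Small linear "FiLM generator": maps the descriptor to 2*C parameters.
# In practice this would be a learned layer; here it is random for illustration.
W = rng.standard_normal((2 * C, R)) * 0.1
b = np.zeros(2 * C)
film_params = W @ freq_response + b
gamma, beta = film_params[:C], film_params[C:]

# Feature-wise Linear Modulation: scale and shift each channel,
# broadcasting the per-channel parameters over frequency and time.
modulated = gamma[:, None, None] * features + beta[:, None, None]

print(modulated.shape)
```

Because the device identity enters only through `gamma` and `beta`, a single generator can be steered toward any target microphone whose frequency response is available, which is what allows the many-to-many mapping described above.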
📝 Abstract
In this study, we introduce Unified Microphone Conversion, a unified generative framework that enhances the resilience of sound event classification systems against device variability. Addressing the limitations of previous work, we condition the generator network on frequency response information to achieve many-to-many device mapping. This overcomes an inherent limitation of CycleGAN, which requires a separate model for each device pair. Our framework leverages CycleGAN's strength of unpaired training to simulate device characteristics in audio recordings and significantly extends its scalability by integrating frequency-response-related information via Feature-wise Linear Modulation. Experimental results show that our method outperforms the state-of-the-art method by 2.6% and reduces variability by 0.8% in macro-average F1 score.