Unified Microphone Conversion: Many-to-Many Device Mapping via Feature-wise Linear Modulation

📅 2024-10-23
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the performance degradation in sound event classification (SEC) caused by microphone device variability, this paper proposes a unified generative framework built on a CycleGAN conditioned on frequency response information. The core idea is to inject microphone frequency response information into the CycleGAN generator via Feature-wise Linear Modulation (FiLM), so that a single model can perform unpaired, many-to-many mapping between devices. This removes the need to train a separate model for each device pair. According to the abstract, the method outperforms the state-of-the-art method by 2.6% and reduces variability by 0.8% in macro-average F1 score.
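For readers unfamiliar with the reported metric, the following is a minimal sketch of macro-averaged F1: the F1 score is computed per class and then averaged with equal weight per class. The labels and the `macro_f1` helper are hypothetical illustrations, not the paper's evaluation code, and the sketch does not attempt to reproduce how the paper measures variability.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sound event labels, for illustration only.
y_true = ["dog", "dog", "siren", "siren"]
y_pred = ["dog", "siren", "siren", "siren"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because every class contributes equally, a rare sound event that is misclassified lowers the macro average as much as a common one would.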

📝 Abstract
In this study, we introduce Unified Microphone Conversion, a unified generative framework to enhance the resilience of sound event classification systems against device variability. To address the limitations of previous work, we condition the generator network on frequency response information to achieve many-to-many device mapping. This approach overcomes an inherent limitation of CycleGAN, which requires a separate model for each device pair. Our framework leverages the strengths of CycleGAN for unpaired training to simulate device characteristics in audio recordings, and significantly extends its scalability by integrating frequency-response-related information via Feature-wise Linear Modulation. Experimental results show that our method outperforms the state-of-the-art method by 2.6% and reduces variability by 0.8% in macro-average F1 score.
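The conditioning mechanism named in the abstract, Feature-wise Linear Modulation, scales and shifts each feature channel using parameters predicted from a conditioning input. Below is a minimal, dependency-free sketch of that general idea. The conditioning vector, the weights, and the helper names `linear` and `film` are all hypothetical, and the paper's actual generator architecture is not described on this page.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def linear(x: Sequence[float], weight: Matrix, bias: Sequence[float]) -> List[float]:
    """Affine map y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def film(features: Matrix, gamma: Sequence[float], beta: Sequence[float]) -> Matrix:
    """FiLM: multiply every value in channel c by gamma[c], then add beta[c].

    features is laid out as [channel][time-frequency bin].
    """
    return [[g * v + b for v in channel] for channel, g, b in zip(features, gamma, beta)]


# Hypothetical conditioning vector standing in for frequency response information.
freq_response = [0.5, -1.0, 2.0]

# Hypothetical weights; in a real model these would be learned.
w_gamma, b_gamma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [1.0, 1.0]
w_beta, b_beta = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]], [0.0, 0.5]

gamma = linear(freq_response, w_gamma, b_gamma)  # one scale per channel
beta = linear(freq_response, w_beta, b_beta)     # one shift per channel

features = [[1.0, 2.0], [3.0, 4.0]]  # 2 channels, 2 bins each
modulated = film(features, gamma, beta)
print(modulated)
```

Since the scale and shift depend only on the conditioning vector, the same generator weights can in principle serve different target devices by changing that vector alone.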
Problem

Research questions and friction points this paper is trying to address.

Unified framework for microphone device conversion
Enhances sound event classification against device variability
Enables many-to-many device mappings via unpaired training
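To make the scalability point concrete, here is a small sketch that counts how many models a one-model-per-device-pair setup would need, compared with a single conditioned model. The device names are hypothetical, and the count assumes one CycleGAN per unordered pair, covering both conversion directions.

```python
from itertools import combinations

# Hypothetical device names, for illustration only.
devices = ["mic_a", "mic_b", "mic_c", "mic_d"]

# One model per unordered device pair: N * (N - 1) / 2 models.
pairs = list(combinations(devices, 2))
pairwise_models = len(pairs)

# A single conditioned model is meant to cover every pair.
unified_models = 1

print(pairwise_models, unified_models)
```

The pairwise count grows quadratically with the number of devices, while the unified setup stays at one model.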
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified generative framework for device variability
Feature-wise Linear Modulation for scalability
Unpaired training enables many-to-many device mappings
Myeonghoon Ryu
Deeply Inc., Seoul National University
Hongseok Oh
University of Seoul
Suji Lee
Deeply Inc.
Han Park
Deeply Inc.