ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources

📅 2025-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the efficiency-accuracy trade-off that arises in real-world deployments when available compute fluctuates and multimodal input quality degrades (e.g., through noise corruption). It proposes a layer-granular, cross-modal depth redistribution mechanism that yields a per-layer adaptive multimodal network. The method jointly optimizes gated layer selection, modality-quality-aware scoring, and resource-aware gradient scheduling, augmented by a lightweight feature recalibration module, so that activation depth can be adjusted across modalities in real time and in a coordinated way. By dropping the constraint of a static architecture, the framework matches state-of-the-art (SOTA) accuracy while reducing floating-point operations by up to 75%, which improves robustness and energy efficiency on heterogeneous edge devices under dynamic workloads and noisy inputs.

📝 Abstract
Multimodal deep learning systems are deployed in dynamic scenarios due to the robustness afforded by multiple sensing modalities. Nevertheless, they struggle with varying compute resource availability (due to multi-tenancy, device heterogeneity, etc.) and fluctuating quality of inputs (from sensor feed corruption, environmental noise, etc.). Current multimodal systems employ static resource provisioning and cannot easily adapt when compute resources change over time. Additionally, because they process sensor data with fixed feature extractors, they are ill-equipped to handle variations in modality quality. Consequently, uninformative modalities, such as those with high noise, needlessly consume resources better allocated towards other modalities. We propose ADMN, a layer-wise Adaptive Depth Multimodal Network capable of tackling both challenges: it adjusts the total number of active layers across all modalities to meet compute resource constraints, and continually reallocates layers across input modalities according to their modality quality. Our evaluations show that ADMN can match the accuracy of state-of-the-art networks while reducing their floating-point operations by up to 75%.
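The two behaviours described in the abstract (a total layer budget set by available compute, and a split of that budget that follows modality quality) can be illustrated with a small sketch. This is not the paper's controller: the greedy proportional rule, the function name `allocate_layers`, and the inputs `quality`, `budget`, `max_layers`, and `min_layers` are all assumptions made for illustration, and the quality scores are simply taken as given rather than estimated from the inputs.

```python
from typing import Dict


def allocate_layers(
    quality: Dict[str, float],
    budget: int,
    max_layers: Dict[str, int],
    min_layers: int = 1,
) -> Dict[str, int]:
    """Split a total layer budget across modalities (hypothetical heuristic).

    quality    -- assumed per-modality quality score, higher means cleaner input
    budget     -- total number of layers the current compute allows
    max_layers -- depth of each modality's backbone (upper bound per modality)
    min_layers -- layers every modality keeps regardless of its quality
    """
    if budget < min_layers * len(quality):
        raise ValueError("budget too small for the per-modality minimum")

    # Every modality starts with the guaranteed minimum depth.
    alloc = {m: min_layers for m in quality}
    remaining = budget - min_layers * len(quality)

    # Hand out the remaining layers one at a time. Each layer goes to the
    # modality with the highest quality per allocated layer, so noisy
    # modalities receive fewer layers and clean ones receive more.
    while remaining > 0:
        candidates = [m for m in quality if alloc[m] < max_layers[m]]
        if not candidates:
            break  # every backbone is already at full depth
        best = max(candidates, key=lambda m: quality[m] / (alloc[m] + 1))
        alloc[best] += 1
        remaining -= 1
    return alloc


if __name__ == "__main__":
    depth = {"image": 12, "depth": 12}
    # Clean image stream, noisy depth stream, tight compute budget.
    print(allocate_layers({"image": 0.9, "depth": 0.1}, budget=8, max_layers=depth))
    # Same budget after the noise moves to the image stream.
    print(allocate_layers({"image": 0.1, "depth": 0.9}, budget=8, max_layers=depth))
```

Under these assumed scores, the same budget of 8 layers shifts from the image branch to the depth branch when the noise moves, while the total number of active layers stays fixed. A learned controller, as the abstract implies, would replace both the hand-set scores and the greedy rule.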
Problem

Research questions and friction points this paper is trying to address.

Handles dynamic compute resource availability in multimodal systems.
Adapts to fluctuating input noise and quality variations.
Optimizes resource allocation across modalities for efficiency.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Depth Multimodal Network
Dynamic layer reallocation
Reduced floating-point operations