Robust Embodied Perception in Dynamic Environments via Disentangled Weight Fusion

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of poor generalization and catastrophic forgetting in embodied perception systems operating in dynamic, open-world environments, where distribution shifts are prevalent. The authors propose a novel continual learning framework that operates without access to domain labels or rehearsal of historical samples. By decoupling representation learning to disentangle environmental style from semantic content, the method focuses on capturing invariant, task-relevant features across scenes. Knowledge from previous and new tasks is integrated through a dynamic weight fusion strategy in parameter space. As the first continual learning paradigm for embodied perception that requires neither domain labels nor sample replay, the framework significantly outperforms existing approaches on multiple standard benchmarks, effectively mitigating catastrophic forgetting and enhancing both accuracy and robustness under fully replay-free and label-free conditions.
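The summary does not say how the style/content split is implemented. As one common way to disentangle environmental style from semantic content, per-channel instance statistics can serve as a style code while the normalized features carry the content; the sketch below illustrates that generic technique only (the function name and shapes are ours, not the paper's):

```python
import numpy as np

def disentangle(features):
    """Split a feature map of shape (C, H, W) into style and content.

    Style   = per-channel mean/std (environment-specific statistics).
    Content = instance-normalized features with that style removed.
    """
    mu = features.mean(axis=(1, 2), keepdims=True)          # (C, 1, 1)
    sigma = features.std(axis=(1, 2), keepdims=True) + 1e-5  # avoid div by zero
    content = (features - mu) / sigma                        # style-free features
    style = np.concatenate([mu.ravel(), sigma.ravel()])      # compact style code
    return content, style
```

Training only on the `content` stream would push the model toward scene-invariant semantics, which matches the summary's stated goal.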
📝 Abstract
Embodied perception systems face severe distribution drift as they continuously interact with dynamic, open physical environments. Existing domain-incremental perception methods, however, often rely on domain IDs supplied in advance at test time, which limits their practicality in unknown interaction scenarios. Moreover, models tend to overfit to context-specific perceptual noise, resulting in poor generalization and catastrophic forgetting. To address these limitations, we propose a domain-ID-free and exemplar-free incremental learning framework for embodied multimedia systems that achieves robust continual environment adaptation. The method introduces a disentangled representation mechanism that removes non-essential environmental style interference and guides the model to extract intrinsic semantic features shared across scenes, thereby reducing perceptual uncertainty and improving generalization. A weight fusion strategy then dynamically integrates old and new environment knowledge in parameter space, allowing the model to adapt to new distributions without storing historical data while maximally retaining its discriminative ability in old environments. Extensive experiments on multiple standard benchmarks show that the proposed method significantly reduces catastrophic forgetting in a fully exemplar-free and domain-ID-free setting and achieves higher accuracy than existing state-of-the-art methods.
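The abstract describes integrating old and new knowledge in parameter space. A minimal sketch of such a fusion is a per-parameter linear interpolation between the previous model's weights and the newly adapted ones; the fixed `alpha` below is purely illustrative, whereas the paper chooses the trade-off dynamically:

```python
def fuse_weights(old_params, new_params, alpha=0.5):
    """Linearly interpolate two parameter dicts in weight space.

    alpha close to 1 favors the newly adapted weights; alpha close to 0
    preserves the old environment's knowledge. A fixed alpha is used here
    for illustration only.
    """
    assert old_params.keys() == new_params.keys()
    return {
        name: (1 - alpha) * old_params[name] + alpha * new_params[name]
        for name in old_params
    }
```

After adapting to each new environment, the fused dict replaces the deployed model, so no historical samples or domain IDs are needed at any point.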
Problem

Research questions and friction points this paper is trying to address.

embodied perception
distribution drift
catastrophic forgetting
domain incremental learning
generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

disentangled representation
weight fusion
exemplar-free incremental learning
domain-id free
embodied perception