Disentangling Static and Dynamic Information for Reducing Static Bias in Action Recognition

📅 2025-09-26

📈 Citations: 0

✨ Influential: 0

career value

164K/year

🤖 AI Summary

Action recognition models often suffer from “static bias,” wherein they over-rely on static scene cues (e.g., background, objects), leading to poor generalization—especially in zero-shot settings. To address this, we propose a dual-stream disentanglement framework that explicitly separates static (biased) and dynamic (unbiased) representations. We enforce statistical independence between the two streams via an independence loss and further constrain the static stream to encode only scene information using a scene prediction loss, thereby suppressing its interference with action classification. The method requires no additional annotations and is plug-and-play compatible with mainstream architectures. Experiments across multiple benchmarks demonstrate substantial mitigation of static bias: on zero-shot action recognition, our approach achieves an average accuracy improvement of 8.2%. Moreover, it enhances robustness in real-world scenarios and improves model interpretability.

Technology Category

Application Category

📝 Abstract

Action recognition models rely excessively on static cues rather than dynamic human motion, which is known as static bias. This bias leads to poor performance in real-world applications and zero-shot action recognition. In this paper, we propose a method to reduce static bias by separating temporal dynamic information from static scene information. Our approach uses a statistical independence loss between biased and unbiased streams, combined with a scene prediction loss. Our experiments demonstrate that this method effectively reduces static bias and confirm the importance of scene prediction loss.

Problem

Research questions and friction points this paper is trying to address.

Reducing static bias in action recognition models

Separating temporal dynamic from static scene information

Improving performance in real-world and zero-shot scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Separates temporal dynamic from static scene information

Uses statistical independence loss between biased streams

Combines with scene prediction loss to reduce bias

🔎 Similar Papers

No similar papers found.