Seeing Through Fog: Towards Fog-Invariant Action Recognition

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the significant performance degradation of action recognition under foggy conditions, which is caused by reduced visibility and contrast. The authors introduce FogAct, the first benchmark dataset of paired clean and foggy videos, and propose FogNet, a dual-stream CLIP-based architecture. FogNet leverages contrastive learning and cross-modal alignment to guide foggy video representations with their clean counterparts, thereby learning fog-invariant semantic features for actions that stay consistent across weather conditions. Experiments on FogAct and three other popular action recognition datasets show that FogNet is competitive with state-of-the-art methods, supporting its effectiveness and generalization in foggy environments.
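To make the idea of guiding foggy representations with their clean counterparts more concrete, here is a minimal sketch of one common way to align paired embeddings: a symmetric InfoNCE-style contrastive loss. This is an illustration under assumptions, not the authors' loss. The function name `paired_alignment_loss`, the embedding size of 512, and the temperature of 0.07 are all hypothetical choices, and random vectors stand in for the outputs of the two video streams.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def paired_alignment_loss(clean_feats, foggy_feats, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired clip embeddings.

    Assumption: row i of clean_feats and row i of foggy_feats come from
    the same scene and action, so they form the positive pair.
    """
    clean = l2_normalize(clean_feats)
    foggy = l2_normalize(foggy_feats)
    logits = foggy @ clean.T / temperature  # shape (batch, batch)
    idx = np.arange(logits.shape[0])
    # Foggy-to-clean: each foggy clip should pick out its own clean clip.
    loss_f2c = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Clean-to-foggy: each clean clip should pick out its own foggy clip.
    loss_c2f = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_f2c + loss_c2f)


# Stand-in embeddings (hypothetical): 8 clips, 512 dimensions each.
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 512))
aligned_foggy = clean + 0.05 * rng.normal(size=(8, 512))  # close to clean
unrelated_foggy = rng.normal(size=(8, 512))               # no relation

aligned_loss = paired_alignment_loss(clean, aligned_foggy)
unrelated_loss = paired_alignment_loss(clean, unrelated_foggy)
print(f"aligned: {aligned_loss:.4f}  unrelated: {unrelated_loss:.4f}")
```

In this sketch the loss is small when each foggy embedding sits close to its clean partner and large when the pairs are unrelated, which is the behaviour a clean-guided training objective would aim for. How FogNet actually combines contrastive learning with cross-modal alignment is not specified in this summary.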

📝 Abstract

Foggy conditions are commonly encountered in real-world applications; however, existing action recognition approaches typically assume favorable weather and high-quality video inputs. On foggy days, unpredictable visibility degradation and reduced contrast obstruct the extraction of semantic cues, posing significant challenges for current action recognition methods. In this paper, we address action recognition under foggy conditions with two contributions. First, we present FogAct, the first benchmark dataset for foggy action recognition, consisting of paired clean and foggy videos captured with a stereo camera system. The dataset spans 10 scenes and 55 action categories, comprising nearly 10,000 video clips. Second, we propose FogNet, a two-stream CLIP model that discovers fog-invariant semantic information hidden behind the degraded videos. FogNet learns robust representations of foggy videos with guidance from clean videos, effectively capturing structural and motion cues shared by the two. Extensive experiments on FogAct and three other popular datasets demonstrate that our method achieves competitive performance compared with state-of-the-art (SOTA) approaches. FogAct and FogNet are available on our project page.

Problem

Research questions and friction points this paper is trying to address.

foggy conditions

action recognition

visibility degradation

contrast reduction

semantic cues

Innovation

Methods, ideas, or system contributions that make the work stand out.

fog-invariant action recognition

FogAct dataset

FogNet