UTA-Sign: Unsupervised Thermal Video Augmentation via Event-Assisted Traffic Signage Sketching

📅 2025-08-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Thermal imaging cameras offer robust perception under low-light conditions but struggle to distinguish traffic signs with similar thermal emissivity (e.g., road signs and license plates), leading to semantic understanding failures in autonomous driving. To address this nighttime sign perception blind spot, we propose an unsupervised thermal–event video fusion enhancement method. Our approach features: (1) a motion-guided spatiotemporal alignment network that leverages coarse motion cues from thermal frames to synchronize asynchronous event streams; and (2) a detail enhancement module that exploits high-temporal-resolution event signals to compensate for texture deficiencies in thermal imagery, enabling cross-modal complementarity and temporally consistent representation. Evaluated on a real-world low-light dataset, our method significantly improves sign contour generation quality and detection accuracy (mAP increased by 12.7%), thereby enhancing the robustness of nighttime semantic perception.

📝 Abstract

The thermal camera excels at perceiving outdoor environments under low-light conditions, making it ideal for applications such as nighttime autonomous driving and unmanned navigation. However, thermal cameras struggle to capture signage on objects made of similar materials, which can pose safety risks because autonomous driving systems may fail to interpret scene semantics accurately. In contrast, the neuromorphic vision camera, also known as an event camera, detects changes in light intensity asynchronously and has proven effective in high-speed, low-light traffic environments. Recognizing the complementary characteristics of these two modalities, this paper proposes UTA-Sign, an unsupervised thermal-event video augmentation method for traffic signage in low-illumination environments, targeting elements such as license plates and roadblock indicators. To address the signage blind spots of thermal imaging and the non-uniform sampling of event cameras, we develop a dual-boosting mechanism that fuses thermal frames and event signals for temporally consistent signage representation. The method uses thermal frames to provide accurate motion cues that serve as temporal references for aligning the non-uniformly sampled event signals. In turn, event signals contribute subtle signage content to the raw thermal frames, enhancing the overall understanding of the environment. The method is validated on datasets collected from real-world scenarios, demonstrating superior quality in traffic signage sketching and improved detection accuracy at the perceptual level.
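To make the idea of thermal-event fusion concrete, the following is a minimal sketch and not the UTA-Sign method itself, which is a learned, unsupervised network. It assumes events arrive as rows of (timestamp, x, y, polarity), bins them into the exposure window of one thermal frame, and adds the normalized event magnitude to a thermal frame scaled to [0, 1]. The function names `accumulate_events` and `fuse`, the `gain` parameter, and the event layout are all hypothetical choices made for illustration.

```python
import numpy as np

def accumulate_events(events, t_start, t_end, height, width):
    """Sum event polarities per pixel over the window [t_start, t_end).

    `events` is assumed to be an (N, 4) array of (t, x, y, polarity).
    """
    frame = np.zeros((height, width), dtype=np.float32)
    t = events[:, 0]
    sel = events[(t >= t_start) & (t < t_end)]
    x = sel[:, 1].astype(np.intp)
    y = sel[:, 2].astype(np.intp)
    p = sel[:, 3].astype(np.float32)
    # np.add.at handles repeated pixel indices correctly (unbuffered add).
    np.add.at(frame, (y, x), p)
    return frame

def fuse(thermal, event_frame, gain=0.5):
    """Add normalized event magnitude to a thermal frame in [0, 1].

    A hand-written additive blend standing in for a learned fusion step.
    """
    mag = np.abs(event_frame)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak
    return np.clip(thermal + gain * mag, 0.0, 1.0)

# Toy data: four events, the last one outside the 10 ms window.
events = np.array([
    [0.001, 1, 0, 1],
    [0.002, 1, 0, 1],
    [0.004, 2, 1, -1],
    [0.020, 0, 0, 1],
])
event_frame = accumulate_events(events, 0.0, 0.010, height=2, width=3)
thermal = np.full((2, 3), 0.4, dtype=np.float32)
fused = fuse(thermal, event_frame)
```

In this toy setup the thermal frame's timestamp defines the accumulation window, which loosely mirrors the abstract's use of thermal frames as temporal references. The actual method derives motion cues from the thermal frames rather than using a fixed window.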

Problem

Research questions and friction points this paper is trying to address.

Augmenting thermal video for traffic signage in low-light conditions

Fusing thermal and event camera data to overcome sensing limitations

Improving autonomous driving safety by enhancing signage visibility

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised thermal-event video fusion

Dual-boosting mechanism combining modalities

Event signals enhance thermal frame details

🔎 Similar Papers

Human-in-the-loop Reasoning For Traffic Sign Detection: Collaborative Approach YOLO With Video-LLaVA