🤖 AI Summary
Existing research lacks a publicly available benchmark dataset enabling joint evaluation of visible light communication (VLC) decoding and pose estimation using event cameras under realistic scenarios.
Method: This paper introduces the first synchronized, multi-condition dataset featuring co-registered event streams, intensity frames, and ground-truth 6-DOF poses—captured indoors and outdoors under varying illumination and motion patterns. We propose a contrast-maximization-based, event-driven LED localization framework that eliminates reliance on conventional AR markers or intensity frames, enabling low-latency, motion-compensated pose estimation. The system integrates VLC bitstream decoding, multi-condition camera–LED calibration, and automated ground-truth generation.
Contribution/Results: Experiments demonstrate that our method reduces pose estimation error by 37% compared to AR-marker-based approaches under challenging conditions—including motion blur and low-light scenes—establishing a new benchmark and methodological paradigm for event-camera-driven mobile VLC and localization.
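The summary mentions that the system integrates VLC bitstream decoding from event streams. As a hedged illustration of the general idea, not the paper's actual modulation scheme or decoder, the sketch below models an on-off-keyed (OOK) LED that emits one event per brightness transition, and reconstructs the bitstream from transition timestamps and polarities. The names `encode_events` and `decode_events`, the slot duration, and the ideal one-event-per-transition assumption are all illustrative.

```python
# Hedged sketch: decode an OOK bitstream from event polarity transitions.
# Assumes ideal events: exactly one event at each slot boundary where the
# LED toggles (positive polarity = off->on, negative = on->off).

def encode_events(bits, slot=1e-3):
    """Emit (timestamp, polarity) events at the transitions of an OOK LED."""
    events, prev = [], 0
    for i, b in enumerate(bits):
        if b != prev:
            events.append((i * slot, +1 if b else -1))
            prev = b
    return events

def decode_events(events, n_bits, slot=1e-3):
    """Reconstruct the bitstream from transition events (LED starts off)."""
    bits, level, j = [], 0, 0
    for i in range(n_bits):
        t = i * slot
        # Apply any transition that falls at this slot boundary.
        while j < len(events) and events[j][0] <= t + slot / 2:
            level = 1 if events[j][1] > 0 else 0
            j += 1
        bits.append(level)
    return bits

msg = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode_events(encode_events(msg), len(msg)))  # recovers msg
```

A real event camera produces many events per LED transition plus noise, so a practical decoder would cluster events per pixel region and threshold transition rates per slot; this sketch only shows the timing logic.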
📝 Abstract
Optical communication using modulated LEDs (e.g., visible light communication) is an emerging application for event cameras, thanks to their high spatio-temporal resolution. Event cameras can be used not only to decode the LED signals but also to localize the camera relative to the LED marker positions. However, there is no public dataset for benchmarking the decoding and localization in various real-world settings. We present, to the best of our knowledge, the first public dataset that consists of an event camera, a frame camera, and ground-truth poses that are precisely synchronized with hardware triggers. It covers various camera motions and sensor sensitivities across different scene brightness levels, both indoors and outdoors. Furthermore, we propose a novel localization method that leverages the Contrast Maximization framework for motion estimation and compensation. The detailed analysis and experimental results demonstrate the advantages of LED-based localization with events over conventional AR-marker-based localization with frames, as well as the efficacy of the proposed method. We hope that the proposed dataset serves as a future benchmark for both motion-related classical computer vision tasks and LED marker decoding tasks simultaneously, paving the way to broadening applications of event cameras on mobile devices. https://woven-visionai.github.io/evlc-dataset
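The Contrast Maximization framework the abstract builds on works by warping events to a reference time under a candidate motion and scoring how sharp (e.g., how high-variance) the resulting image of warped events is; the motion that maximizes this contrast best compensates the camera motion. Below is a minimal self-contained sketch on synthetic data. The constant-velocity model, grid search, and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def warp_events(xs, ys, ts, v, t_ref=0.0):
    """Warp event coordinates to t_ref under a constant 2-D image velocity v."""
    dx = xs - v[0] * (ts - t_ref)
    dy = ys - v[1] * (ts - t_ref)
    return dx, dy

def contrast(xs, ys, ts, v, shape=(64, 64)):
    """Variance of the image of warped events; higher means sharper edges."""
    wx, wy = warp_events(xs, ys, ts, v)
    img, _, _ = np.histogram2d(wy, wx, bins=shape,
                               range=[[0, shape[0]], [0, shape[1]]])
    return img.var()

# Synthetic events: an edge near x=20 moving right at 5 px/s over 1 s.
rng = np.random.default_rng(0)
ts = rng.uniform(0.0, 1.0, 2000)
xs = 20.0 + 5.0 * ts + rng.normal(0.0, 0.3, 2000)
ys = rng.uniform(0.0, 64.0, 2000)

# Grid search over candidate velocities: the true one maximizes contrast.
cands = [(vx, 0.0) for vx in np.linspace(0.0, 10.0, 21)]
best = max(cands, key=lambda v: contrast(xs, ys, ts, v))
print(best[0])  # close to the true 5 px/s
```

In practice, Contrast Maximization methods replace the grid search with gradient-based optimization over richer motion models (rotation, homography, 6-DOF), but the objective is the same sharpness score shown here.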