BlinkTrack: Feature Tracking over 100 FPS via Events and Images

📅 2024-09-26
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Event cameras offer high temporal resolution and immunity to motion blur but capture little texture, leading to error accumulation in feature tracking and limiting their use in structure from motion (SfM) and simultaneous localization and mapping (SLAM). To address this, the authors propose BlinkTrack, a learnable, differentiable Kalman filtering framework with a dual-branch architecture that fuses asynchronous event streams and RGB images end to end. They also introduce new synthetic and augmented datasets to better evaluate the model. The method runs in real time, exceeding 100 FPS with preprocessed event data and 80 FPS with multi-modality data, and significantly outperforms existing event-based approaches in both tracking accuracy and robustness under challenging conditions such as low illumination and high-speed motion.

📝 Abstract
Feature tracking is crucial for structure from motion (SfM), simultaneous localization and mapping (SLAM), object tracking, and various other computer vision tasks. Event cameras, known for their high temporal resolution and ability to capture asynchronous changes, have gained significant attention for their potential in feature tracking, especially in challenging conditions. However, event cameras lack the fine-grained texture information that conventional cameras provide, leading to error accumulation in tracking. To address this, we propose a novel framework, BlinkTrack, which integrates event data with RGB images for high-frequency feature tracking. Our method extends the traditional Kalman filter into a learning-based framework, utilizing differentiable Kalman filters in both event and image branches. This approach improves single-modality tracking, resolves ambiguities, and supports asynchronous data fusion. We also introduce new synthetic and augmented datasets to better evaluate our model. Experimental results indicate that BlinkTrack significantly outperforms existing event-based methods, exceeding 100 FPS with preprocessed event data and 80 FPS with multi-modality data.
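BlinkTrack's Kalman filter is learned end to end, with network-predicted uncertainties; the paper does not specify its exact form here. As a rough illustration of the classical filtering backbone it extends, the sketch below fuses asynchronous 2-D feature measurements from a fast "event branch" and a slower "image branch" with a plain constant-velocity Kalman filter, using fixed per-branch noise variances where the learned model would predict them. All names, rates, and values are illustrative, not the paper's implementation.

```python
import numpy as np

def kf_predict(x, P, dt, q=1e-3):
    """Propagate a constant-velocity state [px, py, vx, vy] forward by dt."""
    F = np.eye(4)
    F[0, 2] = F[1, 3] = dt
    Q = q * dt * np.eye(4)  # simple process noise, grows with elapsed time
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, r):
    """Fuse a 2-D position measurement z with scalar noise variance r."""
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # observe position only
    S = H @ P @ H.T + r * np.eye(2)
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# True feature moves with constant velocity (1.0, 0.5) px/ms.
rng = np.random.default_rng(0)
x = np.zeros(4)   # initial state: position + velocity
P = np.eye(4)

# Asynchronous stream: event-branch measurements every 1 ms (noisier),
# image-branch measurements every 10 ms (cleaner), merged in time order.
meas = [(t, "event", 0.5) for t in range(1, 31)]
meas += [(t, "image", 0.05) for t in (10, 20, 30)]
meas.sort()

t_prev = 0.0
for t, branch, r in meas:
    x, P = kf_predict(x, P, t - t_prev)      # predict to measurement time
    true_pos = np.array([1.0 * t, 0.5 * t])
    z = true_pos + rng.normal(0, np.sqrt(r), 2)
    x, P = kf_update(x, P, z, r)
    t_prev = t

print(np.round(x[:2], 1))  # fused position estimate, near (30, 15)
```

Because each update is timestamped and self-contained, measurements from either branch can be absorbed whenever they arrive; the learning-based variant keeps this structure but makes the gain computation differentiable so the branch uncertainties can be trained from data.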
Problem

Research questions and friction points this paper is trying to address.

Fusing asynchronous event and image data for high-frequency feature tracking
Error accumulation in event-only tracking caused by sparse texture
Data association and fusion across asynchronous modalities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates event data with RGB images
Uses learning-based differentiable Kalman filters
Introduces new synthetic and augmented datasets
👥 Authors
Yichen Shen (State Key Lab of CAD&CG, Zhejiang University)
Yijin Li (State Key Lab of CAD&CG, Zhejiang University, China)
Shuo Chen (State Key Lab of CAD&CG, Zhejiang University)
Guanglin Li (State Key Lab of CAD&CG, Zhejiang University)
Zhaoyang Huang (Chinese University of Hong Kong)
Hujun Bao (State Key Lab of CAD&CG, Zhejiang University)
Zhaopeng Cui (Zhejiang University)
Guofeng Zhang (State Key Lab of CAD&CG, Zhejiang University)