Lightweight Temporal Transformer Decomposition for Federated Autonomous Driving

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Single-frame visual inputs exhibit poor robustness in complex scenes, while state-of-the-art temporal models suffer from excessive computational overhead, hindering their deployment in federated learning (FL) settings.
Method: We propose a lightweight Temporal Transformer decomposition framework that factorizes the global attention matrix into low-rank components, drastically reducing parameter count and computational complexity. We further design an FL-aware distributed training strategy enabling efficient parameter aggregation and real-time inference. The model jointly processes multi-frame images and steering sequences to achieve accurate temporal modeling and cross-modal feature fusion under resource constraints.
Results: Our approach outperforms existing SOTA methods on three benchmark datasets, achieving significant accuracy gains and inference latency below 30 ms. Extensive real-world robotic experiments validate its practical deployability and strong generalization capability in heterogeneous edge environments.

📝 Abstract
Traditional vision-based autonomous driving systems often struggle to navigate complex environments when they rely solely on single-image inputs. Incorporating temporal data, such as past image frames or steering sequences, has proven effective in enhancing robustness and adaptability in challenging scenarios. However, existing high-performance methods often rely on resource-intensive fusion networks, making them impractical to train and unsuitable for federated learning. To address these challenges, we propose lightweight temporal transformer decomposition, a method that processes sequential image frames and temporal steering data by breaking down large attention maps into smaller matrices. This approach reduces model complexity, enabling efficient weight updates for convergence and real-time predictions while leveraging temporal information to enhance autonomous driving performance. Extensive experiments on three datasets demonstrate that our method outperforms recent approaches by a clear margin while achieving real-time performance. Real robot experiments further confirm the effectiveness of our method.
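The idea of breaking a large attention map into smaller matrices can be illustrated with a minimal NumPy sketch. This is not the paper's decomposition, whose exact form is not given on this page. It swaps in a plain truncated SVD as a stand-in, and the sizes (`n_tokens`, `dim`, `rank`) and function names are hypothetical choices for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_map(q, k):
    """Dense scaled dot-product attention map of shape (n_tokens, n_tokens)."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))


def low_rank_factors(a, rank):
    """Approximate `a` by two smaller matrices using a truncated SVD.

    Assumption: SVD is used here only as a generic low-rank stand-in,
    not as the decomposition proposed in the paper.
    """
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # shape (n_tokens, rank)
    right = vt[:rank, :]            # shape (rank, n_tokens)
    return left, right


# Hypothetical sizes chosen for illustration.
rng = np.random.default_rng(0)
n_tokens, dim, rank = 64, 16, 8
q = rng.standard_normal((n_tokens, dim))
k = rng.standard_normal((n_tokens, dim))
v = rng.standard_normal((n_tokens, dim))

a = attention_map(q, k)
left, right = low_rank_factors(a, rank)

out_full = a @ v               # uses the full n_tokens x n_tokens map
out_low = left @ (right @ v)   # never materializes the full map

full_entries = a.size                      # n_tokens * n_tokens
factor_entries = left.size + right.size    # 2 * n_tokens * rank
rel_err = np.linalg.norm(out_full - out_low) / np.linalg.norm(out_full)
print(full_entries, factor_entries, rel_err)
```

With these sizes the two factors hold a quarter of the entries of the dense map. The printed relative error shows how much accuracy this particular rank gives up on random inputs, which says nothing about the trade-off reported in the paper.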
Problem

Research questions and friction points this paper is trying to address.

Enhance autonomous driving in complex environments using temporal data
Reduce reliance on resource-intensive fusion networks so the model is compatible with federated learning
Achieve real-time performance with lightweight temporal transformer decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight temporal transformer decomposition method
Breaks large attention maps into smaller matrices
Enables efficient federated learning and real-time predictions
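To make the federated learning point concrete, here is a minimal sketch of one server-side aggregation round. The page does not describe the paper's aggregation strategy, so the sketch assumes standard federated averaging (FedAvg), where each client's parameters are weighted by its local sample count. The parameter names `left` and `right`, the client values, and the sample counts are all hypothetical.

```python
import numpy as np


def federated_average(client_params, client_sizes):
    """Weighted average of per-client parameter dicts (FedAvg-style).

    Assumption: plain FedAvg is used as a stand-in; the paper's own
    aggregation scheme is not specified in this summary.
    """
    total = float(sum(client_sizes))
    names = client_params[0].keys()
    return {
        name: sum(p[name] * (n / total)
                  for p, n in zip(client_params, client_sizes))
        for name in names
    }


# Two hypothetical clients, each holding small decomposed factors.
clients = [
    {"left": np.full((4, 2), 1.0), "right": np.full((2, 4), 1.0)},
    {"left": np.full((4, 2), 3.0), "right": np.full((2, 4), 5.0)},
]
sizes = [100, 300]  # hypothetical local sample counts

global_params = federated_average(clients, sizes)
print(global_params["left"][0, 0], global_params["right"][0, 0])
```

Because each client sends only the small factors rather than a dense matrix, the payload per round scales with the factor sizes, which is one plausible reason a decomposed model suits federated training.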
Tuong Do
AIOZ, Singapore
Binh X. Nguyen
AI Researcher at AIOZ
Computer Science, Computer Vision, Machine Learning
Quang D. Tran
Research Scientist, University of Liverpool
Machine Learning, Computer Vision, Robotics, Federated Learning, Data Science
Erman Tjiputra
Department of Computer Science, NTHU, Taiwan
Te-Chuan Chiu
AIOZ, Singapore
Anh Nguyen
Department of Computer Science, University of Liverpool, UK