Single-Eye View: Monocular Real-time Perception Package for Autonomous Driving

📅 2026-03-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes LRHPerception, a real-time, multi-task perception system for autonomous driving that operates on monocular video input. Addressing the common trade-off between computational efficiency and performance in camera-based approaches, LRHPerception integrates the efficiency of end-to-end learning with the representational power of local mapping within a unified framework. The system jointly performs object detection, trajectory prediction, semantic road segmentation, and pixel-wise depth estimation, producing a structured five-channel perception tensor. Requiring only a single viewpoint, it achieves a real-time inference speed of 29 FPS on a single GPU, a 555% speedup over the fastest current map-building method, while maintaining state-of-the-art perception accuracy.

๐Ÿ“ Abstract
Amidst the rapid advancement of camera-based autonomous driving technology, effectiveness is often prioritized with limited attention to computational efficiency. To address this issue, this paper introduces LRHPerception, a real-time monocular perception package for autonomous driving that uses single-view camera video to interpret the surrounding environment. The proposed system combines the computational efficiency of end-to-end learning with the rich representational detail of local mapping methodologies. With significant improvements in object tracking and prediction, road segmentation, and depth estimation integrated into a unified framework, LRHPerception processes monocular image data into a five-channel tensor consisting of RGB, road segmentation, and pixel-level depth estimation, augmented with object detection and trajectory prediction. Experimental results demonstrate strong performance, achieving real-time processing at 29 FPS on a single GPU, representing a 555% speedup over the fastest mapping-based approach.
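The abstract describes packing RGB, road segmentation, and pixel-level depth into a single five-channel tensor. A minimal sketch of that packing step is below; the function name, normalization choices, and array layout are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def build_perception_tensor(rgb, road_mask, depth):
    """Stack RGB, road segmentation, and depth into an (H, W, 5) tensor.

    rgb:       (H, W, 3) uint8 camera frame
    road_mask: (H, W) binary road-segmentation mask
    depth:     (H, W) per-pixel depth estimates
    """
    h, w, _ = rgb.shape
    assert road_mask.shape == (h, w) and depth.shape == (h, w)
    # Bring each modality onto a shared [0, 1] scale before stacking.
    rgb_f = rgb.astype(np.float32) / 255.0
    seg_f = road_mask.astype(np.float32)[..., None]
    depth_f = (depth / max(float(depth.max()), 1e-6)).astype(np.float32)[..., None]
    return np.concatenate([rgb_f, seg_f, depth_f], axis=-1)

# Toy example on a 4x4 frame
rgb = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
depth = np.random.rand(4, 4).astype(np.float32)
tensor = build_perception_tensor(rgb, mask, depth)
print(tensor.shape)  # (4, 4, 5)
```

Object detections and predicted trajectories would then be overlaid on top of this tensor rather than stored as extra channels, per the abstract's phrasing ("augmented with object detection and trajectory prediction").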
Problem

Research questions and friction points this paper is trying to address.

autonomous driving
monocular perception
computational efficiency
real-time processing
camera-based perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

monocular perception
real-time processing
end-to-end learning
unified perception framework
autonomous driving
Haixi Zhang
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA
Aiyinsi Zuo
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA
Zirui Li
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA
Chunshu Wu
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA
Tong Geng
Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA
Zhiyao Duan
Professor of Electrical and Computer Engineering, University of Rochester
Computer Audition, Music Information Retrieval, Speech Processing, Audiovisual Learning