🤖 AI Summary
Traditional radar segmentation primarily focuses on classifying moving objects, yet class predictions in the radar domain suffer from limited reliability; meanwhile, distinguishing static from dynamic objects is a fundamental prerequisite for vehicle perception. This paper proposes the first end-to-end neural network framework that jointly performs two tasks directly from raw radar point clouds: (1) static/dynamic object segmentation, and (2) 2D ego-motion velocity estimation based on detections classified as static. The method combines MLPs and RNNs into a lightweight spatiotemporal feature-extraction module that requires no preprocessing. Evaluated on the RadarScenes dataset, the framework achieves strong performance on both tasks. The paper introduces novel evaluation metrics and, for the first time, empirically demonstrates that raw radar point clouds alone can support joint perception of static/dynamic semantics and ego-motion, establishing a new paradigm for real-time radar-based automotive perception.
📝 Abstract
Conventional radar segmentation research has typically focused on learning category labels for moving objects. However, fundamental differences between radar and optical sensors limit the reliability of predicting accurate and consistent category labels, and a review of common automotive radar perception tasks shows that determining whether an object is moving or static is a prerequisite for most of them. To fill this gap, this study proposes a neural-network-based solution that simultaneously segments static and moving objects from radar point clouds. Furthermore, since the measured radial velocity of static objects is correlated with the motion of the radar itself, the approach can also estimate the instantaneous 2D velocity of the moving platform or vehicle (ego-motion). Despite performing this dual task, the proposed method employs simple yet effective building blocks for feature extraction: multi-layer perceptrons (MLPs) and recurrent neural networks (RNNs). Beyond being the first of its kind in the literature, the method also demonstrates that the information required for the dual task can be extracted directly from unprocessed point clouds, without cloud aggregation, Doppler compensation, motion compensation, or any other intermediate signal-processing step. To measure performance, the study introduces a set of novel evaluation metrics and evaluates the method on a challenging real-world radar dataset, RadarScenes. The results show that the proposed method not only performs well on the dual task but also has broad application potential in other radar perception tasks.
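The geometric fact the abstract relies on, namely that the radial velocities of static detections encode the sensor's own motion, can be sketched with a classical least-squares fit. This is a minimal illustration of the underlying relation, not the paper's learned approach; the function name, the single-sensor setup, and the assumption that static detections are already identified are all hypothetical here:

```python
import numpy as np

def estimate_ego_velocity(azimuths, radial_velocities):
    """Least-squares 2D ego-velocity estimate from static radar detections.

    For a static object seen at azimuth theta (in the sensor frame), the
    measured radial velocity is the negative projection of the sensor's
    own velocity onto the line of sight:
        v_r = -(v_x * cos(theta) + v_y * sin(theta))
    Stacking many detections gives an overdetermined linear system A v = v_r.
    """
    A = -np.column_stack([np.cos(azimuths), np.sin(azimuths)])
    v_xy, *_ = np.linalg.lstsq(A, radial_velocities, rcond=None)
    return v_xy

# Synthetic example: sensor moving at (10, 1) m/s observes 50 static objects.
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi / 2, np.pi / 2, size=50)
v_true = np.array([10.0, 1.0])
v_r = -(np.cos(theta) * v_true[0] + np.sin(theta) * v_true[1])
v_r += rng.normal(scale=0.05, size=theta.shape)  # simulated Doppler noise

v_est = estimate_ego_velocity(theta, v_r)
```

In the paper's setting the network replaces this two-step pipeline, jointly deciding which points are static and regressing the 2D velocity, but the sketch shows why static points alone suffice for ego-motion recovery.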