🤖 AI Summary
Arbitrary rotations make the directional structure of a point cloud ambiguous, which hinders multi-scale directional feature modeling. To address this, we propose DiPVNet, a deep architecture built on the Atomic Pointwise Dot-product (APD) operator, which is proven to be rotation-invariant. DiPVNet combines rotation invariance with directional sensitivity by integrating a Learnable Local Dot-Product (L2DP) operator with a Direction-Aware Spherical Fourier Transform (DASFT). The network builds a hierarchical local-to-global directional representation using generalized harmonic analysis and multi-level feature aggregation. In experiments involving noise and large-angle rotations, DiPVNet achieves state-of-the-art performance on point cloud classification and segmentation benchmarks, with improved robustness and generalization relative to existing rotation-invariant methods.
📝 Abstract
Point cloud processing has become a cornerstone technology in many 3D vision tasks. However, arbitrary rotations introduce variations in point cloud orientations, posing a long-standing challenge for effective representation learning. The core of this issue is that rotational perturbations disrupt the point cloud's intrinsic directional characteristics. Recent methods attempt to implicitly model rotational equivariance and invariance, preserving directional information and propagating it into deep semantic spaces. Yet they often fall short of fully exploiting the multiscale directional nature of point clouds to enhance feature representations. To address this, we propose the Direction-Perceptive Vector Network (DiPVNet). At its core is an atomic dot-product operator that simultaneously encodes directional selectivity and rotation invariance, endowing the network with both rotational symmetry modeling and adaptive directional perception. At the local level, we introduce a Learnable Local Dot-Product (L2DP) operator, which enables interactions between a center point and its neighbors to adaptively capture the non-uniform local structures of point clouds. At the global level, we leverage generalized harmonic analysis to prove that the dot-product between point clouds and spherical sampling vectors is equivalent to a direction-aware spherical Fourier transform (DASFT). This leads to the construction of a global directional response spectrum for modeling holistic directional structures. We rigorously prove the rotation invariance of both operators. Extensive experiments on challenging scenarios involving noise and large-angle rotations demonstrate that DiPVNet achieves state-of-the-art performance on point cloud classification and segmentation tasks. Our code is available at https://github.com/wxszreal0/DiPVNet.
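
The rotation invariance claimed for the dot-product operators rests on a basic identity: for any rotation R, (Ra)·(Rb) = a·b, because RᵀR = I. The sketch below checks this numerically for the dot products between a center point's neighbor offsets. It is not the paper's L2DP operator, which is learnable and is not specified in this abstract; the helper names (`rotation_matrix`, `local_dot_products`), the random neighborhood, and the chosen angle are all illustrative assumptions.

```python
import math
import random


def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation about a unit `axis` by `angle` radians."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def apply(R, p):
    """Multiply the 3x3 matrix R by the 3-vector p."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def local_dot_products(center, neighbors):
    """Hypothetical helper: pairwise dot products of center-relative offsets."""
    offsets = [[n[k] - center[k] for k in range(3)] for n in neighbors]
    return [[dot(a, b) for b in offsets] for a in offsets]


# Illustrative data: one center point with 8 random neighbors.
rng = random.Random(0)
center = [rng.uniform(-1, 1) for _ in range(3)]
neighbors = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]

# Random rotation axis, with a large rotation angle (2.5 rad, about 143 degrees).
raw = [rng.gauss(0, 1) for _ in range(3)]
norm = math.sqrt(dot(raw, raw))
axis = [v / norm for v in raw]
R = rotation_matrix(axis, 2.5)

before = local_dot_products(center, neighbors)
after = local_dot_products(apply(R, center), [apply(R, n) for n in neighbors])

# Largest change in any pairwise dot product (should be floating-point noise).
max_diff = max(
    abs(before[i][j] - after[i][j]) for i in range(8) for j in range(8)
)
# Largest change in any raw coordinate (should be clearly non-zero).
coord_shift = max(
    abs(a - b) for n in neighbors for a, b in zip(n, apply(R, n))
)

print("max change in dot products:", max_diff)
print("max change in raw coordinates:", coord_shift)
```

The raw coordinates move substantially under the rotation while the dot products change only at floating-point precision, which is why features built from such dot products can remain sensitive to relative direction within a neighborhood without depending on the global orientation of the cloud.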