🤖 AI Summary
To address the poor robustness and limited semantic content of 3D reconstruction in agricultural scenes, which are characterized by non-uniform illumination, severe occlusion, and a limited field of view, this paper proposes NIRSplat, the first multimodal 3D Gaussian Splatting framework integrating near-infrared (NIR) imagery with textual metadata derived from vegetation indices (NDVI, NDWI, chlorophyll index). The paper makes three contributions: it introduces NIRPlant, the first multimodal agricultural dataset comprising synchronized NIR/RGB images, LiDAR point clouds, and structured textual vegetation metadata; it designs a cross-modal fusion architecture that combines cross-attention with 3D positional encoding; and it incorporates geometric priors to regularize optimization. Experiments show that NIRSplat outperforms 3DGS, CoR-GS, and InstantSplat in reconstruction completeness and geometric accuracy while improving semantic interpretability. The code and the NIRPlant dataset are publicly released.
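For readers unfamiliar with the vegetation indices named above, the following is a minimal sketch of how they are commonly computed from per-pixel reflectance and turned into a text string. The formulas are the standard textbook definitions (NDWI in the McFeeters green/NIR form, and the green chlorophyll index); the summary does not state which variants the paper uses, and the `describe` helper and its output format are hypothetical, introduced only for illustration.

```python
from typing import Sequence

EPS = 1e-8  # small constant to avoid division by zero on dark pixels


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red + EPS)


def ndwi(green: float, nir: float) -> float:
    """NDWI, McFeeters form (an assumption): (Green - NIR) / (Green + NIR)."""
    return (green - nir) / (green + nir + EPS)


def ci_green(nir: float, green: float) -> float:
    """Green chlorophyll index (an assumption): NIR / Green - 1."""
    return nir / (green + EPS) - 1.0


def describe(nir: Sequence[float], red: Sequence[float],
             green: Sequence[float]) -> str:
    """Average each index over the pixels and format it as text.

    The string layout is hypothetical, not the dataset's metadata schema.
    """
    n = len(nir)
    mean_ndvi = sum(ndvi(a, r) for a, r in zip(nir, red)) / n
    mean_ndwi = sum(ndwi(g, a) for g, a in zip(green, nir)) / n
    mean_ci = sum(ci_green(a, g) for a, g in zip(nir, green)) / n
    return f"NDVI={mean_ndvi:.2f}; NDWI={mean_ndwi:.2f}; CIgreen={mean_ci:.2f}"
```

Healthy vegetation reflects strongly in NIR and absorbs red light, so a high NDVI indicates dense, photosynthetically active foliage. This is the kind of signal that is not recoverable from RGB alone.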
📝 Abstract
While 3D Gaussian Splatting (3DGS) has rapidly advanced, its application in agriculture remains underexplored. Agricultural scenes present unique challenges for 3D reconstruction methods, particularly due to uneven illumination, occlusions, and a limited field of view. To address these limitations, we introduce NIRPlant, a novel multimodal dataset encompassing Near-Infrared (NIR) imagery, RGB imagery, textual metadata, depth, and LiDAR data collected under varied indoor and outdoor lighting conditions. By integrating NIR data, our approach enhances robustness and provides crucial botanical insights that extend beyond the visible spectrum. Additionally, we leverage text-based metadata derived from vegetation indices, such as NDVI, NDWI, and the chlorophyll index, which significantly enriches the contextual understanding of complex agricultural environments. To fully exploit these modalities, we propose NIRSplat, an effective multimodal Gaussian splatting architecture employing a cross-attention mechanism combined with 3D point-based positional encoding, providing robust geometric priors. Comprehensive experiments demonstrate that NIRSplat outperforms existing landmark methods, including 3DGS, CoR-GS, and InstantSplat, highlighting its effectiveness in challenging agricultural scenarios. The code and dataset are publicly available at: https://github.com/StructuresComp/3D-Reconstruction-NIR
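The abstract does not spell out how cross-attention and 3D point-based positional encoding are combined, so the following is only a generic sketch of the idea, not the NIRSplat architecture. It assumes that per-point features are concatenated with a sinusoidal encoding of their 3D coordinates and used as queries, while embeddings from another modality (for example NIR or text tokens) serve as keys and values. All function names, dimensions, and the random weights are hypothetical.

```python
import numpy as np


def positional_encoding_3d(xyz: np.ndarray, num_freqs: int) -> np.ndarray:
    """Sinusoidal encoding of (N, 3) points into (N, 3 * 2 * num_freqs)."""
    freqs = 2.0 ** np.arange(num_freqs)                # (F,)
    angles = xyz[:, :, None] * freqs[None, None, :]    # (N, 3, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(xyz.shape[0], -1)


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries: (N, d_q) point-side features; context: (M, d_c) tokens
    from the other modality. Returns fused (N, d) features and the
    (N, M) attention weights.
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Toy example with random data (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(5, 3))     # 3D point positions
point_feats = rng.normal(size=(5, 8))            # per-point features
context_tokens = rng.normal(size=(4, 16))        # e.g. NIR/text embeddings

pos = positional_encoding_3d(points, num_freqs=4)        # (5, 24)
queries = np.concatenate([point_feats, pos], axis=1)     # (5, 32)

w_q = rng.normal(size=(32, 16)) * 0.1
w_k = rng.normal(size=(16, 16)) * 0.1
w_v = rng.normal(size=(16, 16)) * 0.1
fused, attn = cross_attention(queries, context_tokens, w_q, w_k, w_v)
```

Each row of `attn` sums to one, so every point receives a convex combination of the projected context tokens. Concatenating the positional encoding rather than adding it is one of several reasonable choices and is made here only to keep the dimensions explicit.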