🤖 AI Summary
Diffusion models for 3D LiDAR scene completion suffer from slow sampling, and existing score distillation methods accelerate inference at the cost of degraded performance.
Method: This paper proposes Distillation-DPO, the first framework to integrate Direct Preference Optimization (DPO) into diffusion distillation. It constructs preference pairs using non-differentiable LiDAR evaluation metrics, such as Chamfer Distance and F-Score, to guide the student model toward the teacher's score function. A joint optimization mechanism combines paired-noise generation with learning driven by the score difference between teacher and student.
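A minimal PyTorch sketch of the pair-construction idea follows; `student.sample` and `student.latent_shape` are hypothetical placeholders rather than the authors' API, and the real pipeline may rank completions with several metrics rather than Chamfer Distance alone.

```python
import torch

def chamfer_distance(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer Distance between point clouds of shape (N, 3) and (M, 3)."""
    d = torch.cdist(pred, gt)                       # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

@torch.no_grad()
def build_preference_pair(student, sparse_scan, gt_scene):
    """Sample two completions from different initial noises and rank them with a
    non-differentiable metric (here Chamfer Distance against the ground truth)."""
    noise_a = torch.randn(student.latent_shape)     # hypothetical attribute
    noise_b = torch.randn(student.latent_shape)
    scene_a = student.sample(sparse_scan, noise_a)  # hypothetical sampler API
    scene_b = student.sample(sparse_scan, noise_b)
    cd_a = chamfer_distance(scene_a, gt_scene)
    cd_b = chamfer_distance(scene_b, gt_scene)
    # Lower Chamfer Distance wins; the metric only needs to rank the pair,
    # not to be differentiable.
    return (scene_a, scene_b) if cd_a < cd_b else (scene_b, scene_a)
```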
Contribution/Results: By bypassing the need for differentiable losses, Distillation-DPO enables end-to-end alignment with the evaluation objectives that matter in practice. Experiments demonstrate that the method achieves higher-quality completion than state-of-the-art diffusion-based approaches while accelerating inference by more than 5×.
📝 Abstract
The application of diffusion models in 3D LiDAR scene completion is limited by diffusion's slow sampling speed. Score distillation accelerates diffusion sampling but with performance degradation, while post-training with Direct Preference Optimization (DPO) boosts performance using preference data. This paper proposes Distillation-DPO, a novel diffusion distillation framework for LiDAR scene completion with preference alignment. First, the student model generates paired completion scenes from different initial noises. Second, using LiDAR scene evaluation metrics as preferences, we construct winning and losing sample pairs. Such construction is reasonable, since most LiDAR scene metrics are informative but non-differentiable and thus cannot be optimized directly. Third, Distillation-DPO optimizes the student model by exploiting the difference in score functions between the teacher and student models on the paired completion scenes. This procedure is repeated until convergence. Extensive experiments demonstrate that, compared to state-of-the-art LiDAR scene completion diffusion models, Distillation-DPO achieves higher-quality scene completion while accelerating the completion speed by more than 5-fold. To the best of our knowledge, our method is the first to explore adopting preference learning in distillation, and it provides insights into preference-aligned distillation. Our code is publicly available at https://github.com/happyw1nd/DistillationDPO.
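As a rough illustration of the third step, one plausible form of the score-difference optimization, by analogy with DPO-style objectives for diffusion models, is sketched below; `teacher.add_noise`, `teacher.num_timesteps`, and the `beta` weighting are illustrative assumptions, not the paper's published loss.

```python
import torch
import torch.nn.functional as F

def distillation_dpo_step(student, teacher, x_win, x_lose, beta=0.1):
    """One hypothetical preference-aligned distillation step: make the student match
    the frozen teacher's score more closely on the winning scene than on the losing one."""
    b = x_win.shape[0]
    t = torch.randint(0, teacher.num_timesteps, (b,), device=x_win.device)
    noise = torch.randn_like(x_win)
    xw_t = teacher.add_noise(x_win, noise, t)   # hypothetical forward-diffusion helper
    xl_t = teacher.add_noise(x_lose, noise, t)

    with torch.no_grad():                       # the frozen teacher provides target scores
        eps_tw = teacher(xw_t, t)
        eps_tl = teacher(xl_t, t)
    err_win = (student(xw_t, t) - eps_tw).pow(2).flatten(1).mean(dim=1)
    err_lose = (student(xl_t, t) - eps_tl).pow(2).flatten(1).mean(dim=1)

    # DPO-style objective: reward a smaller teacher-matching error on the winner
    # than on the loser, instead of matching both branches equally.
    return -F.logsigmoid(-beta * (err_win - err_lose)).mean()
```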