Mixed Precision PointPillars for Efficient 3D Object Detection with TensorRT

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing real-time efficiency and accuracy in LiDAR-based 3D object detection, where naive quantization often degrades performance due to wide data distributions and prevalent outliers. Focusing on PointPillars, the authors propose a mixed-precision quantization framework that identifies sensitivity-critical layers via per-layer analysis, preserving them in floating point while quantizing the rest to INT8. Leveraging an extremely small calibration set to mitigate outlier effects, the approach supports both post-training quantization (PTQ) for training-free deployment and quantization-aware training (QAT) for fine-tuning. By integrating sensitivity-guided layer selection, greedy-search-based precision allocation, and an efficient calibration strategy, the method significantly alleviates quantization-induced accuracy loss without full retraining. Implemented on TensorRT, it achieves up to 2.35× inference acceleration and 2.26× model compression, with the QAT variant closely matching full-precision baseline performance.

📝 Abstract
LiDAR 3D object detection is one of the important tasks for autonomous vehicles, and ensuring that it operates in real time is crucial. Model quantization can accelerate inference, but applying it directly often degrades performance due to the wide numerical distributions and extreme outliers of LiDAR data. To address the wide numerical distribution, we propose a mixed precision framework designed for PointPillars. Our framework first searches for sensitive layers with post-training quantization (PTQ) by quantizing one layer at a time to 8-bit integer (INT8) and evaluating each model for average precision (AP). The top-k most sensitive layers are assigned as floating point (FP). Combinations of these layers are greedily searched to produce candidate mixed precision models, which are finalized with either PTQ or quantization-aware training (QAT). Furthermore, to handle outliers, we observe that using a very small number of calibration data reduces the likelihood of encountering outliers, thereby improving PTQ performance. Our method provides mixed precision models without training in the PTQ pipeline, while our QAT pipeline achieves performance competitive with FP models. With TensorRT deployment, our mixed precision models reduce latency by up to 2.538× compared to FP32 models.
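The layer-wise search the abstract describes can be sketched as follows. `evaluate_ap` is a hypothetical callback standing in for "quantize these layers to INT8 and measure AP" (not the paper's actual API), and the greedy pass simply grows the FP set along the sensitivity ranking, a simplification of the paper's candidate search:

```python
def sensitivity_search(layers, evaluate_ap, top_k=3):
    """Rank layers by AP drop under one-at-a-time INT8 quantization,
    then greedily grow the set of layers kept in floating point.

    `evaluate_ap(int8_layers)` is a hypothetical callback: it quantizes
    exactly the named layers to INT8, leaves the rest in FP, and
    returns the resulting average precision (AP).
    """
    baseline = evaluate_ap(int8_layers=[])  # full-precision reference
    # Per-layer sensitivity: AP lost when only this layer is INT8.
    drops = {name: baseline - evaluate_ap(int8_layers=[name])
             for name in layers}
    ranking = sorted(drops, key=drops.get, reverse=True)[:top_k]

    # Greedy pass: keep one more sensitive layer in FP at each step,
    # quantize everything else, and track the best candidate model.
    best_fp, best_ap = (), float("-inf")
    fp_set = []
    for name in ranking:
        fp_set.append(name)
        int8 = [n for n in layers if n not in fp_set]
        ap = evaluate_ap(int8_layers=int8)
        if ap > best_ap:
            best_fp, best_ap = tuple(fp_set), ap
    return best_fp, best_ap
```

In the full pipeline, each surviving candidate would then be finalized with PTQ as-is or fine-tuned with QAT before TensorRT deployment.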
Problem

Research questions and friction points this paper is trying to address.

3D object detection
model quantization
numerical distribution
outliers
real-time inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixed Precision Quantization
PointPillars
Post-Training Quantization (PTQ)
Quantization-Aware Training (QAT)
TensorRT
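The calibration observation, that a tiny calibration set is less likely to contain an extreme outlier and therefore yields a tighter quantization range, can be illustrated with a generic symmetric absmax calibrator. This is a standard PTQ technique and a toy example, not the paper's exact calibrator:

```python
import numpy as np

def absmax_scale(samples, num_bits=8):
    """Symmetric scale from the max |x| seen over the calibration set."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    return max(float(np.abs(s).max()) for s in samples) / qmax

def fake_quant_error(x, scale, qmax=127):
    """Mean absolute error after round-trip fake quantization."""
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return float(np.abs(x - q * scale).mean())

# Hypothetical LiDAR-like feature batches: bulk values in [-1, 1],
# with one rare extreme outlier hiding in the larger calibration pool.
bulk = np.linspace(-1.0, 1.0, 1000)
calib_pool = [bulk.copy() for _ in range(10)]
calib_pool[9][0] = 100.0  # the outlier

small_scale = absmax_scale(calib_pool[:2])  # tiny set: outlier not seen
large_scale = absmax_scale(calib_pool)      # full pool: outlier inflates scale

# The tiny calibration set gives a far smaller round-trip error
# on the bulk of the data.
assert fake_quant_error(bulk, small_scale) < fake_quant_error(bulk, large_scale)
```

With the outlier included, the INT8 range is stretched to cover ±100, so the bulk of the values in [-1, 1] collapse onto only a few quantization bins; the two-sample set keeps the full 8-bit resolution for the typical range.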
Ninnart Fuengfusin
Advanced Mobility Research Institute, Kanazawa University
Keisuke Yoneda
Kanazawa University
Naoki Suganuma
Advanced Mobility Research Institute, Kanazawa University