XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding

📅 2025-07-31
🤖 AI Summary
Autoregressive mesh generation models suffer from high inference latency because they require thousands to tens of thousands of sequential next-token predictions. To address this, we propose multi-head speculative decoding, which predicts multiple candidate vertex/patch tokens in parallel under topological constraints and integrates a lightweight geometric validator with a dynamic resampling mechanism to ensure geometric fidelity. Furthermore, we introduce a knowledge distillation-based training paradigm for the decoding heads that aligns their probability distributions with the backbone's, significantly reducing validation overhead. Our method maintains identical topological validity and reconstruction quality, measured by Chamfer Distance (CD) and F-Score, while achieving an average 1.7× inference speedup across multiple benchmarks. This work establishes a scalable pathway toward real-time, high-fidelity 3D mesh generation.

📝 Abstract
Current auto-regressive models can generate high-quality, topologically precise meshes; however, they necessitate thousands, or even tens of thousands, of next-token predictions during inference, resulting in substantial latency. We introduce XSpecMesh, a quality-preserving acceleration method for auto-regressive mesh generation models. XSpecMesh employs a lightweight, multi-head speculative decoding scheme to predict multiple tokens in parallel within a single forward pass, thereby accelerating inference. We further propose a verification and resampling strategy: the backbone model verifies each predicted token and resamples any tokens that do not meet the quality criteria. In addition, we propose a distillation strategy that trains the lightweight decoding heads by distilling from the backbone model, encouraging their prediction distributions to align and improving the success rate of speculative predictions. Extensive experiments demonstrate that our method achieves a 1.7× speedup without sacrificing generation quality. Our code will be released.
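The loop the abstract describes, draft several tokens in parallel, verify each against the backbone, and resample the first rejected token, can be sketched with a toy stand-in for the backbone. All function names and the token rule below are hypothetical illustrations, not taken from the paper's code:

```python
def backbone_next(seq):
    # Toy deterministic "backbone" next-token rule; a stand-in for the
    # full auto-regressive mesh model (hypothetical, for illustration).
    return (sum(seq) * 31 + len(seq)) % 100

def draft_heads(seq, k=4, last_head_error=0):
    # k lightweight heads guess k future tokens in one forward pass.
    # Here they reuse the backbone rule; last_head_error injects a
    # deliberately wrong final guess to exercise the reject path.
    drafts, s = [], list(seq)
    for i in range(k):
        t = (backbone_next(s) + (last_head_error if i == k - 1 else 0)) % 100
        drafts.append(t)
        s.append(t)
    return drafts

def speculative_step(seq, k=4, last_head_error=0):
    """Draft k tokens, verify each against the backbone, accept the
    longest correct prefix, and resample the first mismatch."""
    accepted, s = [], list(seq)
    for t in draft_heads(seq, k, last_head_error):
        target = backbone_next(s)
        if t == target:
            accepted.append(t)          # draft verified, keep it
        else:
            accepted.append(target)     # reject draft, take backbone token
        s.append(accepted[-1])
        if t != target:
            break                       # stop at the first rejection
    return seq + accepted

def autoregressive(seq, n):
    # Plain one-token-at-a-time decoding, for comparison.
    s = list(seq)
    for _ in range(n):
        s.append(backbone_next(s))
    return s
```

Because every accepted token is checked against the backbone, the output matches plain auto-regressive decoding exactly; the speedup comes from verifying a batch of drafts in one backbone pass instead of k sequential ones.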
Problem

Research questions and friction points this paper is trying to address.

Accelerate auto-regressive mesh generation with low latency
Maintain high-quality mesh output during fast generation
Exploit speculative decoding for parallel token prediction without breaking sequential dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-head speculative decoding for parallel token prediction
Verification and resampling to ensure quality criteria
Distillation strategy to align prediction distributions
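The distribution-alignment objective in the last bullet is, in the standard speculative-decoding setup, a KL divergence pulling each draft head's next-token distribution toward the backbone's. A minimal sketch; the function names and the exact loss form are assumptions, not taken from the paper:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(head_logits, backbone_logits, eps=1e-12):
    """KL(backbone || head): pushes the lightweight head's next-token
    distribution toward the backbone's, which raises the acceptance
    rate of speculative drafts (assumed loss form)."""
    p = softmax(backbone_logits)  # teacher: backbone model
    q = softmax(head_logits)      # student: draft decoding head
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))
```

The loss is zero when the two distributions coincide and grows as the head drifts from the backbone, so minimizing it directly targets the speculative-prediction success rate.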
Dian Chen
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Yansong Qu
Purdue University-West Lafayette
Intelligent Transportation · Autonomous Driving
Xinyang Li
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Ming Li
Shandong Inspur Database Technology Co., Ltd.
Shengchuan Zhang
Xiamen University
Computer Vision · Machine Learning