ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Single-image human mesh recovery (HMR) is ill-posed because of depth ambiguity and occlusion, and existing probabilistic methods often produce meshes that are misaligned with the 2D image evidence and are not robust on in-the-wild images. This work proposes ADHMR, a framework that aligns a diffusion-based HMR model through preference optimization. It has three components: (1) HMR-Scorer, an assessment model that rates mesh predictions even on in-the-wild images without 3D annotations; (2) a preference dataset built with HMR-Scorer, pairing a winner and a loser prediction for each input image; and (3) fine-tuning of the base model on this dataset with Direct Preference Optimization (DPO). HMR-Scorer is also used for data cleaning, which improves existing HMR models even with fewer training samples. The authors report that ADHMR outperforms current state-of-the-art methods in extensive experiments.

📝 Abstract
Human mesh recovery (HMR) from a single image is inherently ill-posed due to depth ambiguity and occlusions. Probabilistic methods have tried to solve this by generating numerous plausible 3D human mesh predictions, but they often exhibit misalignment with 2D image observations and weak robustness to in-the-wild images. To address these issues, we propose ADHMR, a framework that Aligns a Diffusion-based HMR model in a preference optimization manner. First, we train a human mesh prediction assessment model, HMR-Scorer, capable of evaluating predictions even for in-the-wild images without 3D annotations. We then use HMR-Scorer to create a preference dataset, where each input image has a pair of winner and loser mesh predictions. This dataset is used to finetune the base model using direct preference optimization. Moreover, HMR-Scorer also helps improve existing HMR models by data cleaning, even with fewer training samples. Extensive experiments show that ADHMR outperforms current state-of-the-art methods. Code is available at: https://github.com/shenwenhao01/ADHMR.
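
The preference-dataset step described in the abstract can be illustrated with a minimal sketch. It assumes that the base model has already produced several candidate meshes for one image and that HMR-Scorer has assigned each a scalar score where higher means better. The function name `build_preference_pair`, the `min_gap` parameter, and the rule of picking the highest- and lowest-scored candidates are all hypothetical choices for illustration; the abstract only states that each image gets one winner and one loser prediction.

```python
from typing import NamedTuple, Optional, Sequence


class PreferencePair(NamedTuple):
    """Indices of the preferred and dispreferred candidates for one image."""
    winner: int
    loser: int


def build_preference_pair(
    scores: Sequence[float], min_gap: float = 0.0
) -> Optional[PreferencePair]:
    """Pick a winner/loser pair from per-candidate scores (higher = better).

    Assumption: the winner is the highest-scored candidate and the loser the
    lowest-scored one. Returns None when there are fewer than two candidates
    or when the score gap is not larger than `min_gap`, since such a pair
    carries little preference signal.
    """
    if len(scores) < 2:
        return None
    indices = range(len(scores))
    winner = max(indices, key=scores.__getitem__)
    loser = min(indices, key=scores.__getitem__)
    if scores[winner] - scores[loser] <= min_gap:
        return None
    return PreferencePair(winner, loser)
```

For example, with candidate scores `[0.2, 0.9, 0.5]` the sketch selects candidate 1 as the winner and candidate 0 as the loser, while tied scores yield no pair.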
Problem

Research questions and friction points this paper is trying to address.

Addressing misalignment in 3D human mesh predictions from images
Improving robustness of mesh recovery for in-the-wild images
Enhancing existing HMR models via HMR-Scorer-based data cleaning
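
As a rough illustration of the data-cleaning idea in the last point, the sketch below keeps only training samples whose quality score reaches a threshold. The function name `clean_dataset`, the threshold rule, and the convention that higher scores are better are assumptions made for this example; the page does not describe the actual cleaning procedure.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def clean_dataset(
    samples: Sequence[T], scores: Sequence[float], threshold: float
) -> List[T]:
    """Keep samples whose score is at least `threshold` (higher = better).

    `scores[i]` is assumed to be a scorer's quality estimate for `samples[i]`,
    for example for the mesh label attached to a training image.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    return [sample for sample, score in zip(samples, scores) if score >= threshold]
```

Filtering this way shrinks the training set, which is consistent with the abstract's remark that existing models can improve even with fewer training samples.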
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns Diffusion-based HMR via preference optimization
Uses HMR-Scorer to assess mesh predictions and build winner/loser preference pairs
Improves existing HMR models by cleaning training data with HMR-Scorer
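
For reference, the generic DPO objective for a single winner/loser pair can be written as a short sketch. It uses the standard log-likelihood form of the loss, with the trained model compared against a frozen reference model. How ADHMR obtains these quantities for a diffusion-based model is not stated on this page, so the inputs here are plain numbers, and the function name `dpo_loss` and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(
    logp_w: float,
    logp_l: float,
    ref_logp_w: float,
    ref_logp_l: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one preference pair.

    logp_w / logp_l: log-likelihood of the winner / loser under the model
    being fine-tuned. ref_logp_w / ref_logp_l: the same under the frozen
    reference model. beta scales how strongly preferences are enforced.
    """
    # Margin by which the model favors the winner more than the reference does.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Loss is -log(sigmoid(margin)), evaluated in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the fine-tuned model and the reference agree, the margin is zero and the loss equals log 2; the loss falls below that value as the model shifts probability toward the winner relative to the reference.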
Authors

Wenhao Shen, Nanyang Technological University; Computer Vision, 3D Vision
Wanqi Yin, SenseTime Research; Computer Vision, Motion Capture, Digital Human
Xiaofeng Yang, Nanyang Technological University
Cheng Chen, Nanyang Technological University
Chaoyue Song, Nanyang Technological University; 3D Vision
Zhongang Cai, SenseTime Research
Lei Yang, SenseTime Research
Hao Wang, The Hong Kong University of Science and Technology (Guangzhou)
Guosheng Lin, Nanyang Technological University; Computer Vision, Machine Learning