Implicit Preference Alignment for Human Image Animation

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty of generating high-fidelity hand motions in human image animation. Hands have many degrees of freedom and a complex structure, and collecting explicit frame-level preference pairs for them is prohibitively expensive. The authors propose an efficient post-training framework that needs no paired preference data: it implicitly maximizes reward by maximizing the likelihood of self-generated high-quality samples while constraining deviation from the pretrained prior. A Hand-Aware Local Optimization mechanism explicitly steers the alignment toward hand regions. The resulting implicit preference alignment strategy reduces the cost of collecting and annotating preference data while improving hand motion quality.

📝 Abstract

Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Code is released at https://github.com/mdswyz/IPA
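
The shape of the objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or their exact loss. It assumes a diffusion-style model in which a denoising mean squared error stands in for the negative log-likelihood, a squared difference to a frozen reference model's prediction stands in for the penalty on deviating from the pretrained prior, and a binary hand mask upweights errors inside hand regions. The function names and the parameters `beta` and `hand_weight` are hypothetical.

```python
from typing import Sequence


def weighted_mse(pred: Sequence[float], target: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted mean squared error over flat per-pixel values."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * (p - t) ** 2
               for p, t, w in zip(pred, target, weights)) / total


def ipa_style_loss(policy_pred: Sequence[float],
                   ref_pred: Sequence[float],
                   target: Sequence[float],
                   hand_mask: Sequence[float],
                   beta: float = 0.1,
                   hand_weight: float = 2.0) -> float:
    """Illustrative loss only; every term here is an assumption.

    policy_pred: prediction of the model being post-trained
    ref_pred:    prediction of the frozen pretrained model on the same input
    target:      denoising target from a selected self-generated sample
    hand_mask:   1.0 inside hand regions, 0.0 elsewhere
    """
    # Hand pixels get weight `hand_weight`, all other pixels weight 1.0.
    weights = [1.0 + (hand_weight - 1.0) * m for m in hand_mask]
    # Surrogate for maximizing likelihood of the high-quality sample.
    likelihood_term = weighted_mse(policy_pred, target, weights)
    # Surrogate for penalizing deviation from the pretrained prior.
    prior_term = weighted_mse(policy_pred, ref_pred, [1.0] * len(policy_pred))
    return likelihood_term + beta * prior_term
```

In this sketch, an error that falls inside the hand mask contributes more to the loss than the same error elsewhere, and `beta` trades off fitting the self-generated samples against staying close to the pretrained model. No preference pair appears anywhere in the computation, which is the property the abstract emphasizes.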

Problem

Research questions and friction points this paper is trying to address.

human image animation

hand motion generation

preference alignment

high-fidelity animation

data-efficient learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit Preference Alignment

Hand-Aware Local Optimization

Preference-Free Alignment