InsHuman: Towards Natural and Identity-Preserving Human Insertion

📅 2026-05-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key challenges in inserting specific individuals into target scenes with image editing models, namely pose misalignment, inconsistent subject count, and facial identity distortion, compounded by the lack of publicly available datasets featuring full-body human figures with realistic human–scene interactions. To overcome these limitations, the authors propose InsHuman, a framework that introduces a Human-Background Adaptive Fusion (HBAF) module for region-aware scene integration and a Face-to-Face ID-Preserving (FFIP) mechanism that keeps each face's identity consistent. They also construct BDP-InsHuman, a high-quality dataset capturing realistic human–scene interactions, using a Bidirectional Data Pairing (BDP) strategy. By combining foreground detection, region alignment, and bidirectional data pairing, the method is reported to improve the realism and plausibility of synthesized images while preserving subject identity.
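The region-aware weighting attributed to HBAF can be illustrated with a small sketch. This is not the authors' implementation: the function name `region_weighted_mse`, the weight values, and the use of plain nested lists in place of latent tensors are all assumptions made for illustration. It only shows the general idea of penalizing errors inside a binary human mask more heavily than errors in the background.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[int]]


def region_weighted_mse(
    pred: Grid,
    target: Grid,
    human_mask: Mask,
    human_weight: float = 2.0,       # hypothetical weight for human regions
    background_weight: float = 1.0,  # hypothetical weight for everything else
) -> float:
    """Weighted mean squared error between two same-shaped 2D grids.

    Cells where human_mask is 1 contribute with human_weight,
    all other cells with background_weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, human_mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            w = human_weight if m else background_weight
            weighted_sum += w * (p - t) ** 2
            weight_total += w
    if weight_total == 0.0:
        raise ValueError("inputs must contain at least one cell")
    return weighted_sum / weight_total


# Toy 2x2 "latents": the top-left cell is marked as a human region.
target = [[0.0, 0.0], [0.0, 0.0]]
mask = [[1, 0], [0, 0]]
error_on_human = region_weighted_mse([[1.0, 0.0], [0.0, 0.0]], target, mask)
error_on_background = region_weighted_mse([[0.0, 1.0], [0.0, 0.0]], target, mask)
```

With these toy inputs, the same unit error costs more when it falls inside the masked region, which is the behavior a region-aware weighting is meant to produce. A real training setup would apply this to latent tensors with a mask obtained from a foreground human detector.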
📝 Abstract
Human insertion aims to naturally place specific individuals into a target background. Although existing image editing models may have this ability, they often produce failure cases, including inappropriate human poses in the new background, an inconsistent number of people, and altered facial identity. Moreover, publicly available human datasets often lack full-body portraits and realistic physical interaction between humans and their background. To address these challenges, we propose InsHuman for natural and identity-preserving human insertion. Specifically, we propose Human-Background Adaptive Fusion (HBAF), which detects foreground humans to obtain a binary mask and applies region-aware weighting to align the human regions between predicted and ground-truth latents, ensuring the person's pose, count, and overall appearance are coherently adapted to the target background. We further propose Face-to-Face ID-Preserving (FFIP), which detects faces in the generated image and the source image and matches them by their face recognition features to enforce identity consistency for each face. In addition, we propose a Bidirectional Data Pairing (BDP) strategy to construct BDP-InsHuman, a high-quality dataset with realistic human-background interactions. Experiments demonstrate that InsHuman achieves significant improvements in generating plausible images while keeping human identity unchanged.
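The face matching step described for FFIP can be sketched as follows. The abstract does not say how faces are paired or how identity consistency is scored, so the greedy one-to-one matching, the cosine-similarity score, and the names `match_faces` and `identity_loss` are assumptions for illustration only. The embeddings are assumed to come from some face recognition model that is not shown here.

```python
import math
from typing import List, Sequence, Tuple

Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_faces(
    source_embs: List[Embedding], generated_embs: List[Embedding]
) -> List[Tuple[int, int, float]]:
    """Greedy one-to-one matching (an assumed strategy, not the paper's).

    Returns (source_index, generated_index, similarity) triples,
    picking the most similar unused pair first.
    """
    candidates = sorted(
        (
            (cosine_similarity(s, g), i, j)
            for i, s in enumerate(source_embs)
            for j, g in enumerate(generated_embs)
        ),
        reverse=True,
    )
    used_source, used_generated = set(), set()
    matches = []
    for sim, i, j in candidates:
        if i in used_source or j in used_generated:
            continue
        used_source.add(i)
        used_generated.add(j)
        matches.append((i, j, sim))
    return matches


def identity_loss(
    source_embs: List[Embedding], generated_embs: List[Embedding]
) -> float:
    """Mean (1 - cosine similarity) over matched face pairs; 0.0 if none."""
    matches = match_faces(source_embs, generated_embs)
    if not matches:
        return 0.0
    return sum(1.0 - sim for _, _, sim in matches) / len(matches)


# Toy 2D embeddings: the generated image lists the same two faces in swapped order.
source = [[1.0, 0.0], [0.0, 1.0]]
generated = [[0.0, 1.0], [1.0, 0.0]]
pairs = sorted((i, j) for i, j, _ in match_faces(source, generated))
loss = identity_loss(source, generated)
```

Matching before scoring matters when several people are inserted: each generated face is compared against its own source identity rather than against whichever face happens to share its position in the detector output.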
Problem

Research questions and friction points this paper is trying to address.

human insertion
identity preservation
pose consistency
human-background interaction
facial identity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human Insertion
Identity Preservation
Human-Background Interaction
Face Recognition
Image Synthesis