From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

📅 2025-06-25
🤖 AI Summary
Existing lifespan face aging methods struggle to balance age accuracy against identity preservation (the Age-ID trade-off), particularly across large age gaps and extreme head poses, which undermines realism. This paper proposes Cradle2Cane, a two-pass framework built on few-step text-to-image (T2I) diffusion models. The first pass uses Adaptive Noise Injection (AdaNI), guided by a text prompt describing age and gender, to control the strength of aging; identity is only weakly preserved at this stage. The second pass denoises the first-pass output while conditioning on two identity-aware embeddings, SVR-ArcFace and Rotate-CLIP, to restore identity without compromising age accuracy. Both passes are trained jointly end to end. On the CelebA-HQ test set, evaluated under the Face++ and Qwen-VL protocols, Cradle2Cane outperforms existing face aging methods in both age accuracy and identity consistency.

📝 Abstract
Face aging has become a crucial task in computer vision, with applications ranging from entertainment to healthcare. However, existing methods struggle to achieve a realistic and seamless transformation across the entire lifespan, especially when handling large age gaps or extreme head poses. The core challenge lies in balancing age accuracy and identity preservation, which we refer to as the Age-ID trade-off. Most prior methods prioritize either age transformation at the expense of identity consistency or the reverse. In this work, we address this issue by proposing a two-pass face aging framework, named Cradle2Cane, based on few-step text-to-image (T2I) diffusion models. The first pass targets age accuracy by introducing an adaptive noise injection (AdaNI) mechanism. This mechanism is guided by prompt descriptions of the person's age and gender, supplied as the textual condition. By adjusting the noise level, we can also control the strength of aging while allowing more flexibility in transforming the face. Identity preservation is only weakly enforced in this pass, to facilitate stronger age transformations. In the second pass, we enhance identity preservation while maintaining age-specific features by conditioning the model on two identity-aware embeddings (IDEmb): SVR-ArcFace and Rotate-CLIP. This pass denoises the transformed image from the first pass, ensuring stronger identity preservation without compromising age accuracy. Both passes are jointly trained end to end. Extensive experiments on the CelebA-HQ test dataset, evaluated through the Face++ and Qwen-VL protocols, show that Cradle2Cane outperforms existing face aging methods in age accuracy and identity consistency.
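
To make the first-pass idea more concrete, here is a minimal sketch of noise injection whose strength grows with the requested age gap, together with an age-and-gender text prompt. It is an illustration under stated assumptions, not the paper's implementation: the prompt template, the linear gap-to-strength mapping, the `lo`/`hi`/`max_gap` values, and all function names are hypothetical. Only the variance-preserving noising formula is standard for diffusion models.

```python
import numpy as np


def build_prompt(age: int, gender: str) -> str:
    """Hypothetical prompt template; the paper's actual wording is not given here."""
    return f"a photo of a {age} year old {gender}"


def noise_strength(source_age: int, target_age: int,
                   max_gap: int = 80, lo: float = 0.2, hi: float = 0.8) -> float:
    """Assumed mapping: a larger age gap gets a larger noise strength in [lo, hi]."""
    gap = min(abs(target_age - source_age), max_gap)
    return lo + (hi - lo) * gap / max_gap


def inject_noise(image: np.ndarray, strength: float,
                 rng: np.random.Generator) -> np.ndarray:
    """Variance-preserving forward noising: sqrt(1 - s) * x0 + sqrt(s) * eps.

    Here s plays the role of (1 - alpha_bar_t) in the usual diffusion notation.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    eps = rng.standard_normal(image.shape)
    return np.sqrt(1.0 - strength) * image + np.sqrt(strength) * eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.standard_normal((3, 64, 64))  # stand-in for an encoded face image
    s = noise_strength(source_age=25, target_age=70)
    noisy = inject_noise(face, s, rng)
    print(build_prompt(70, "man"), "| strength:", round(s, 3), "| shape:", noisy.shape)
```

In this sketch a small age change leaves most of the input intact, while a large one discards more of it, so the text-conditioned denoiser has more freedom to change the face. A real pipeline would pass the noised input and the prompt to a few-step T2I diffusion model, and a second pass would then denoise again with identity embeddings as additional conditioning.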
Problem

Research questions and friction points this paper is trying to address.

Achieving realistic lifespan face aging with large age gaps
Balancing age accuracy and identity preservation trade-off
Handling extreme head poses in face aging transformations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-pass face aging framework
Adaptive noise injection mechanism
Identity-aware embeddings for preservation
Authors

Tao Liu
VCIP, College of Computer Science, Nankai University

Dafeng Zhang
Samsung Research China – Beijing (SRC-B)
computer vision, low-level vision, face

Gengchen Li
School of Electrical and Information Engineering, Zhengzhou University

Shizhuo Liu
Samsung Research China – Beijing (SRC-B)

Yongqi Song
Samsung Research China – Beijing (SRC-B)

Senmao Li
Ph.D. student, Nankai University
GANs, Image-to-image translation, Diffusion Models

Shiqi Yang
SB Intuitions, SoftBank

Boqian Li
School of Computer, Zhengzhou University of Aeronautics

Kai Wang
Computer Vision Center, Universitat Autònoma de Barcelona

Yaxing Wang
Associate professor, Nankai University
Deep learning, GANs, Image-to-image translation, Transfer learning