🤖 AI Summary
This work proposes the Visual Personalization Turing Test (VPTT), a framework for evaluating whether generated content is perceptually indistinguishable from what a specific user might plausibly create or share, rather than whether it replicates their identity. The framework comprises a 10k-persona benchmark (VPTT-Bench), a visual retrieval-augmented generator (VPRAG), and a text-only evaluation metric, the VPTT Score. Because it centers on perceptual indistinguishability, the paradigm supports scalable and privacy-preserving assessment of personalized generation. Experiments show that the VPTT Score correlates highly with both human judgments and vision-language model (VLM) evaluations, and that VPRAG achieves the best balance between alignment and originality.
📝 Abstract
We introduce the Visual Personalization Turing Test (VPTT), a new paradigm for evaluating contextual visual personalization based on perceptual indistinguishability rather than identity replication. A model passes the VPTT if its output (image, video, 3D asset, etc.) is indistinguishable to a human or calibrated vision-language model (VLM) judge from content a given person might plausibly create or share. To operationalize VPTT, we present the VPTT Framework, integrating a 10k-persona benchmark (VPTT-Bench), a visual retrieval-augmented generator (VPRAG), and the VPTT Score, a text-only metric calibrated against human and VLM judgments. We show high correlation across human, VLM, and VPTT evaluations, validating the VPTT Score as a reliable perceptual proxy. Experiments demonstrate that VPRAG achieves the best alignment-originality balance, offering a scalable and privacy-safe foundation for personalized generative AI.