🤖 AI Summary
This work investigates how a single vector sketch can undergo a dramatic semantic transformation as strokes are added sequentially: the initial prefix strokes must read as one concept, while the completed sketch reads as a second. To this end, the authors propose Stroke of Surprise, a sequence-aware joint optimization framework that extends visual illusions from the spatial to the temporal domain. They introduce the task of “Progressive Semantic Illusions” and search for a “common structural subspace” in which the prefix strokes are valid for both concepts. Methodologically, they use a dual-branch Score Distillation Sampling (SDS) mechanism together with a new Overlay Loss that enforces spatial complementarity, so that added strokes integrate with the prefix rather than occlude it. Unlike sequential approaches that freeze the initial state, the method adjusts the prefix strokes during optimization. The authors report that it significantly outperforms state-of-the-art baselines in both recognizability and illusion strength.
📝 Abstract
Visual illusions traditionally rely on spatial manipulations such as multi-view consistency. In this work, we introduce Progressive Semantic Illusions, a novel vector sketching task where a single sketch undergoes a dramatic semantic transformation through the sequential addition of strokes. We present Stroke of Surprise, a generative framework that optimizes vector strokes to satisfy distinct semantic interpretations at different drawing stages. The core challenge lies in the "dual-constraint": initial prefix strokes must form a coherent object (e.g., a duck) while simultaneously serving as the structural foundation for a second concept (e.g., a sheep) upon adding delta strokes. To address this, we propose a sequence-aware joint optimization framework driven by a dual-branch Score Distillation Sampling (SDS) mechanism. Unlike sequential approaches that freeze the initial state, our method dynamically adjusts prefix strokes to discover a "common structural subspace" valid for both targets. Furthermore, we introduce a novel Overlay Loss that enforces spatial complementarity, ensuring structural integration rather than occlusion. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baselines in recognizability and illusion strength, successfully expanding visual anagrams from the spatial to the temporal dimension. Project page: https://stroke-of-surprise.github.io/
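The shape of the joint objective described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the SDS terms or the Overlay Loss, so the names `joint_loss` and `overlap_penalty`, the scoring callables, the weight `overlay_weight`, and the product-of-coverage penalty are all hypothetical stand-ins. The only point it tries to show is that the prefix strokes appear in both branches, so an optimizer would have to adjust them to suit both concepts instead of freezing them.

```python
from typing import Callable, List

Stroke = List[float]       # hypothetical: flat list of control-point coordinates
Mask = List[List[float]]   # hypothetical: rasterized stroke coverage, values in [0, 1]


def overlap_penalty(mask_prefix: Mask, mask_delta: Mask) -> float:
    """Assumed stand-in for a complementarity term (not the paper's Overlay Loss).

    Returns the mean per-pixel product of the two coverage maps, which is
    zero when prefix and delta strokes never cover the same pixel.
    """
    total, count = 0.0, 0
    for row_p, row_d in zip(mask_prefix, mask_delta):
        for p, d in zip(row_p, row_d):
            total += p * d
            count += 1
    return total / count if count else 0.0


def joint_loss(
    prefix: List[Stroke],
    delta: List[Stroke],
    score_a: Callable[[List[Stroke]], float],
    score_b: Callable[[List[Stroke]], float],
    overlay: Callable[[List[Stroke], List[Stroke]], float],
    overlay_weight: float = 1.0,
) -> float:
    """Sum of two branch losses plus a weighted overlay term.

    score_a and score_b are placeholders for whatever guidance loss scores
    a sketch against concept A and concept B respectively.
    """
    loss_a = score_a(prefix)          # branch 1: prefix alone vs. concept A
    loss_b = score_b(prefix + delta)  # branch 2: prefix + delta vs. concept B
    return loss_a + loss_b + overlay_weight * overlay(prefix, delta)


# Toy usage with dummy scorers; real scorers would rasterize and query a model.
prefix_strokes = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 0.0]]
delta_strokes = [[2.0, 0.0, 3.0, 1.0]]
value = joint_loss(
    prefix_strokes,
    delta_strokes,
    score_a=lambda strokes: float(len(strokes)),
    score_b=lambda strokes: 0.5 * len(strokes),
    overlay=lambda p, d: 0.25,
    overlay_weight=2.0,
)
```

In a sequential baseline of the kind the abstract contrasts against, the prefix would be optimized against the first term only and then held fixed, so only the delta strokes could respond to the second concept.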