🤖 AI Summary
Existing image-driven robotic drawing methods neglect the biomechanical constraints of human hand/arm motion, yielding stiff trajectories and poor artistic expressiveness, which hinders natural human-robot collaboration. To address this, we propose an end-to-end differentiable optimization framework that jointly integrates biologically plausible motion modeling with differentiable rendering. Specifically, we are the first to embed the sigma-lognormal model, a neuroscientifically grounded model of human hand movements, into a gradient-based optimization pipeline, coupling it with DiffVG for differentiable vector-graphics rendering, an image-space reconstruction loss, and a minimum-time smoothness regularizer. This enables the synthesis of drawing trajectories that are both physiologically realistic and artistically high-fidelity. Our method improves trajectory fluency and visual naturalness, generating robot-executable, ergonomically sound drawing paths that are directly deployable on physical manipulators, and we demonstrate it on synthetic graffiti generation and image abstraction tasks.
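The sigma-lognormal model referenced above represents a pen movement as a superposition of strokes, each with a lognormal speed bump in time and a heading that sweeps from a start to an end angle as the stroke unfolds. A minimal NumPy sketch of this idea follows; the parameter names, the heading-sweep interpolation via the lognormal CDF, and the simple Euler integration are our illustrative assumptions, not the paper's implementation:

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise error function

def lognormal_speed(t, D, t0, mu, sigma):
    """Lognormal speed bump of one stroke.

    D: stroke amplitude (path length), t0: onset time,
    mu, sigma: log-time delay and response spread."""
    s = np.zeros_like(t, dtype=float)
    m = t > t0
    dt = t[m] - t0
    s[m] = D / (sigma * math.sqrt(2 * math.pi) * dt) * np.exp(
        -((np.log(dt) - mu) ** 2) / (2 * sigma ** 2))
    return s

def stroke_fraction(t, t0, mu, sigma):
    """Fraction of the stroke completed at time t (lognormal CDF)."""
    f = np.zeros_like(t, dtype=float)
    m = t > t0
    f[m] = 0.5 * (1.0 + _erf((np.log(t[m] - t0) - mu)
                             / (sigma * math.sqrt(2))))
    return f

def trajectory(t, strokes, p0=(0.0, 0.0)):
    """Superpose stroke velocities and integrate to a 2-D path.

    Each stroke is a dict with keys D, t0, mu, sigma, th_s, th_e;
    the heading sweeps from th_s to th_e as the stroke completes."""
    v = np.zeros((len(t), 2))
    for s in strokes:
        speed = lognormal_speed(t, s["D"], s["t0"], s["mu"], s["sigma"])
        phi = s["th_s"] + (s["th_e"] - s["th_s"]) * stroke_fraction(
            t, s["t0"], s["mu"], s["sigma"])
        v[:, 0] += speed * np.cos(phi)
        v[:, 1] += speed * np.sin(phi)
    dt = t[1] - t[0]
    return np.asarray(p0) + np.cumsum(v, axis=0) * dt
```

For a single stroke with a constant heading, the integrated displacement approaches the amplitude D, which is the sanity check that makes the parameters interpretable as stroke lengths.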
📝 Abstract
Large image generation and vision models, combined with differentiable rendering technologies, have become powerful tools for generating paths that a robot can draw or paint. However, these tools often overlook the intrinsic physicality of the human drawing/writing act, which is usually executed with skillful hand/arm gestures. Taking this into account matters both for the visual aesthetics of the results and for developing closer, more intuitive artist-robot collaboration scenarios. We present a method that bridges this gap by enabling gradient-based optimization of natural, human-like motions guided by cost functions defined in image space. To this end, we use the sigma-lognormal model of human hand/arm movements, with an adaptation that enables its use in conjunction with a differentiable vector graphics (DiffVG) renderer. We demonstrate how this pipeline can generate feasible trajectories for a robot by combining image-driven objectives with a minimum-time smoothing criterion, and we showcase applications including the generation and robotic reproduction of synthetic graffiti as well as image abstraction.
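The abstract combines an image-driven objective with a minimum-time smoothing criterion. The exact criterion is not spelled out here, but a common smoothness surrogate in trajectory optimization is integrated squared jerk; the sketch below is an illustrative stand-in for such a regularizer, not the paper's actual cost:

```python
import numpy as np

def smoothness_cost(traj, dt):
    """Integrated squared jerk of a uniformly sampled 2-D trajectory.

    traj: (N, 2) array of positions sampled every dt seconds.
    Jerk is approximated by the third finite difference; a perfectly
    straight (or constant-acceleration) path scores (near) zero."""
    jerk = np.diff(traj, n=3, axis=0) / dt ** 3
    return float(np.sum(jerk ** 2) * dt)
```

In a gradient-based pipeline such as the one described, a term like this would be weighted against the image-space reconstruction loss so the optimizer trades visual fidelity against motion fluency.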