🤖 AI Summary
To balance physical interpretability against data-driven flexibility in dynamical modeling of soft continuum robots (SCRs), this paper proposes an end-to-end vision-based dynamics learning framework. It combines an autoencoder latent-space model with the Attention Broadcast Decoder (ABCD), a plug-and-play module that produces pixel-accurate attention maps localizing each latent dimension's contribution while filtering out static backgrounds. Coupling these attention maps to a 2D oscillator network lets the learned physical quantities (masses, stiffness, and forces) be visualized directly on the image without prior knowledge, and the learned network autonomously discovers a chain structure of oscillators. Evaluated on single- and double-segment SCR video data, ABCD-based models reduce multi-step prediction error on the two-segment robot by 5.7× for Koopman operator models and by 3.5× for oscillator networks. Unlike standard methods, they also support smooth latent-space extrapolation beyond the training data.
📝 Abstract
Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap with two contributions. (1) We introduce the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds. (2) By coupling these attention maps to 2D oscillator networks, we enable direct on-image visualization of learned dynamics (masses, stiffness, and forces) without prior knowledge. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy: 5.7x error reduction for Koopman operators and 3.5x for oscillator networks on the two-segment robot. The learned oscillator network autonomously discovers a chain structure of oscillators. Unlike standard methods, ABCD models enable smooth latent space extrapolation beyond training data. This fully data-driven approach yields compact, physically interpretable models suitable for control applications.
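To make the notion of a "chain structure of oscillators" concrete, the following is a minimal sketch of a damped coupled-oscillator network rolled out over multiple steps. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the governing equations, so the form m·q'' + d·q' + K·q = f, the one-coordinate-per-oscillator simplification, the semi-implicit Euler integrator, and every name and parameter value here (`chain_stiffness`, `rollout`, `k_ground`, `k_couple`) are hypothetical. In the paper the masses, stiffnesses, and forces are learned from data; here they are fixed by hand.

```python
import numpy as np


def chain_stiffness(n, k_ground, k_couple):
    """Stiffness matrix for n oscillators, each tied to ground and to its chain neighbours.

    Hypothetical helper: a chain topology shows up as a tridiagonal matrix.
    """
    K = np.diag(np.full(n, float(k_ground)))
    for i in range(n - 1):
        # A spring between oscillators i and i+1 adds to both diagonal
        # entries and subtracts from the two off-diagonal entries.
        K[i, i] += k_couple
        K[i + 1, i + 1] += k_couple
        K[i, i + 1] -= k_couple
        K[i + 1, i] -= k_couple
    return K


def rollout(q0, v0, m, K, d, f, dt, steps):
    """Multi-step prediction of m*q'' + d*q' + K q = f with semi-implicit Euler."""
    q = np.asarray(q0, dtype=float).copy()
    v = np.asarray(v0, dtype=float).copy()
    traj = [q.copy()]
    for _ in range(steps):
        a = (f - K @ q - d * v) / m   # acceleration from the force balance
        v = v + dt * a                # update velocity first ...
        q = q + dt * v                # ... then position with the new velocity
        traj.append(q.copy())
    return np.stack(traj)


# Assumed toy values: four unit masses, first oscillator initially displaced.
n = 4
K = chain_stiffness(n, k_ground=1.0, k_couple=5.0)
m = np.ones(n)
q0 = np.array([1.0, 0.0, 0.0, 0.0])
traj = rollout(q0, np.zeros(n), m, K, d=0.5, f=np.zeros(n), dt=0.01, steps=2000)
```

In this sketch, zero entries outside the three central diagonals of `K` mean that non-adjacent oscillators are not directly coupled. A learned stiffness matrix with that sparsity pattern would be one way a chain topology could show up, though how the paper identifies the chain is not stated in the abstract.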