🤖 AI Summary
Real-time, large-scale physical simulation of inextensible Cosserat elastic rods remains computationally prohibitive due to stringent stability constraints on time step size.
Method: This paper proposes a CUDA-based GPU parallel algorithm that uses inter-block synchronization to run multiple explicit integration steps within a single kernel launch, working around conventional time-step limitations while using all of the GPU's streaming multiprocessors. Under some constraints, computation time stays nearly constant regardless of the number of Cosserat elements simulated. It integrates nonlinear dynamical modeling with memory-access optimizations to maximize parallel efficiency.
Contribution/Results: The method supports both extensible and inextensible rod models and targets slender, flexible structures such as cables and guidewires. Compared to CPU implementations, the extensible model runs 40× faster when executing 10 time steps per kernel launch, and the inextensible variant averages a 15.11× speedup. For a cardiovascular catheter–guidewire simulation (2×512 Cosserat elements), it delivers a 13.5× acceleration. The framework reaches haptic interactive rates of 0.5–1 kHz, enabling high-fidelity real-time surgical simulation and robotic manipulation.
📝 Abstract
An elastic rod is a long and thin body able to sustain large global deformations, even if local strains are small. The Cosserat rod is a non-linear elastic rod with an oriented centreline, which enables modelling of bending, stretching and twisting deformations. It can be used for physically-based computer simulation of threads, wires, ropes, as well as flexible surgical instruments such as catheters, guidewires or sutures. We present a massively-parallel implementation of the original CoRdE model as well as our inextensible variation. By superseding the CUDA Scalable Programming Model and using inter-block synchronization, we managed to simulate multiple physics time-steps per single kernel launch, utilizing all the GPU's streaming multiprocessors. Under some constraints, this results in nearly constant computation time, regardless of the number of Cosserat elements simulated. When executing 10 time-steps per single kernel launch, our implementation of the original, extensible CoRdE was 40.0× faster. In a number of tests, the GPU implementation of our inextensible CoRdE modification achieved an average speed-up of 15.11× over the corresponding CPU version. Simulating a catheter/guidewire pair (2×512 Cosserat elements) in a cardiovascular application resulted in a 13.5-fold performance boost, enabling accurate real-time simulation at haptic interactive rates (0.5-1 kHz).
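The idea of running several time steps inside one launch, with a global synchronization point between phases, can be illustrated with a small CPU analogy. The sketch below is not the paper's CUDA code and does not implement CoRdE. It assumes a toy 1D mass-spring chain with unit masses, uses Python threads as stand-ins for thread blocks, and uses `threading.Barrier` as a stand-in for inter-block synchronization. All function names and parameters (`forces`, `step_serial`, `launch_multi_step`, `dt`, `k`, `rest`) are hypothetical.

```python
import threading


def forces(x, i, k, rest):
    """Net spring force on node i of a 1D chain (toy stand-in for rod forces)."""
    f = 0.0
    if i > 0:
        f -= k * ((x[i] - x[i - 1]) - rest)
    if i < len(x) - 1:
        f += k * ((x[i + 1] - x[i]) - rest)
    return f


def step_serial(x, v, dt, k, rest):
    """Reference: one explicit (semi-implicit Euler) step on a single thread."""
    f = [forces(x, i, k, rest) for i in range(len(x))]
    for i in range(len(x)):
        v[i] += dt * f[i]  # unit mass assumed
    for i in range(len(x)):
        x[i] += dt * v[i]


def launch_multi_step(x, v, steps, n_workers, dt=1e-3, k=100.0, rest=1.0):
    """One 'launch' that advances `steps` time steps.

    Each worker owns a contiguous chunk of nodes (like a thread block owning
    a slice of rod elements). Workers never return to the caller between
    steps; they only meet at a shared barrier.
    """
    n = len(x)
    barrier = threading.Barrier(n_workers)
    bounds = [(w * n // n_workers, (w + 1) * n // n_workers)
              for w in range(n_workers)]

    def worker(lo, hi):
        for _ in range(steps):
            # Phase 1: read positions (including neighbours owned by other
            # workers) and update only this worker's velocities.
            for i in range(lo, hi):
                v[i] += dt * forces(x, i, k, rest)
            barrier.wait()  # all reads of x are done before anyone writes x
            # Phase 2: write only this worker's positions.
            for i in range(lo, hi):
                x[i] += dt * v[i]
            barrier.wait()  # all writes to x are done before the next step

    threads = [threading.Thread(target=worker, args=b) for b in bounds]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Calling `launch_multi_step(x, v, steps=10, n_workers=4)` on lists of positions and velocities should give the same state as ten calls to `step_serial`, because the two barriers per step keep neighbour reads and position writes from overlapping. On a GPU the analogous grid-wide barrier has to be supported by the hardware and launch configuration, which may be one reason the abstract qualifies its result with "under some constraints".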