🤖 AI Summary
Cable-driven parallel robots (CDPRs) suffer from limited control robustness in under-constrained configurations and at low sampling rates. Method: This work systematically evaluates the trajectory-tracking performance of PID, DDPG, PPO, and TRPO controllers on a real hardware CDPR platform. It adapts trust-region policy optimization (TRPO) to low-frequency control (≤50 Hz), explicitly constraining policy update steps to improve training stability and safety while reducing reliance on high-frequency sensing. Contribution/Results: TRPO achieves the lowest root-mean-square tracking error across diverse dynamic trajectories (42% lower than PID and 18% lower than PPO) and is markedly more robust to modeling errors and external disturbances. The study validates the feasibility and advantages of policy-gradient reinforcement learning for resource-constrained physical robotic systems.
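The "constrained policy update steps" credited above refer, in the standard TRPO formulation (a general statement of the algorithm, not this paper's specific implementation), to a KL-divergence bound on each policy update:

```latex
\max_{\theta}\;
\mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
\left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\,
A^{\pi_{\theta_{\text{old}}}}(s,a) \right]
\quad \text{s.t.} \quad
\mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}\!
\left[ D_{\mathrm{KL}}\!\big(\pi_{\theta_{\text{old}}}(\cdot \mid s)\,\big\|\,\pi_{\theta}(\cdot \mid s)\big) \right] \le \delta
```

Bounding the expected KL divergence by a small constant $\delta$ limits how far any single update can move the policy, which is the mechanism the summary credits for stable, safe learning on hardware at low control rates.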
📝 Abstract
This study evaluates classical and modern control methods for real-world Cable-Driven Parallel Robots (CDPRs), focusing on under-constrained systems with coarse time discretization. A comparative analysis is conducted between classical PID control and modern reinforcement learning algorithms: Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and Trust Region Policy Optimization (TRPO). The results show that TRPO outperforms the other methods, achieving the lowest root mean square (RMS) tracking errors across a range of trajectories and remaining robust to longer intervals between control updates. TRPO's ability to balance exploration and exploitation enables stable control in noisy, real-world environments, reducing reliance on high-frequency sensor feedback and lowering computational demands. These findings highlight TRPO's potential as a robust solution for complex robotic control tasks, with implications for dynamic environments and future applications in sensor fusion or hybrid control strategies.
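For reference, the RMS tracking error used to compare controllers is conventionally computed over the $N$ sampled end-effector positions along a trajectory (the standard definition; the paper may normalize or weight differently):

```latex
e_{\text{RMS}} = \sqrt{ \frac{1}{N} \sum_{k=1}^{N}
\left\lVert \mathbf{p}_k - \mathbf{p}_k^{\text{ref}} \right\rVert^{2} }
```

where $\mathbf{p}_k$ is the measured end-effector position at control step $k$ and $\mathbf{p}_k^{\text{ref}}$ is the corresponding reference trajectory point.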