Reinforcement Learning for Control Systems with Time Delays: A Comprehensive Survey

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Time delays violate the Markov property of control systems, degrading reinforcement learning performance and heightening stability risks. This work presents the first systematic survey and classification of reinforcement learning approaches designed to address sensing, actuation, and communication delays, organized into five categories: state augmentation, recurrent policies, predictor-based modeling, robust and domain-randomized training, and safety-constrained reinforcement learning. By framing these methods within a unified perspective, the study compares their applicability and inherent trade-offs, offering practical guidelines for method selection. It further identifies critical open challenges, such as stability certification and multi-agent coordination, and outlines promising directions for future research, thereby providing a foundation for designing reliable controllers in delay-prone environments.

📝 Abstract
In the last decade, Reinforcement Learning (RL) has achieved remarkable success in the control and decision-making of complex dynamical systems. However, most RL algorithms rely on the Markov Decision Process assumption, which is violated in practical cyber-physical systems affected by sensing delays, actuation latencies, and communication constraints. Such time delays introduce memory effects that can significantly degrade performance and compromise stability, particularly in networked and multi-agent environments. This paper presents a comprehensive survey of RL methods designed to address time delays in control systems. We first formalize the main classes of delays and analyze their impact on the Markov property. We then systematically categorize existing approaches into five major families: state augmentation and history-based representations, recurrent policies with learned memory, predictor-based and model-aware methods, robust and domain-randomized training strategies, and safe RL frameworks with explicit constraint handling. For each family, we discuss underlying principles, practical advantages, and inherent limitations. A comparative analysis highlights key trade-offs among these approaches and provides practical guidelines for selecting suitable methods under different delay characteristics and safety requirements. Finally, we identify open challenges and promising research directions, including stability certification, large-delay learning, multi-agent communication co-design, and standardized benchmarking. This survey aims to serve as a unified reference for researchers and practitioners developing reliable RL-based controllers in delay-affected cyber-physical systems.
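The first family surveyed, state augmentation, restores the Markov property by appending the queue of not-yet-applied actions to the observation, so the policy can condition on what is already "in flight." A minimal sketch of the idea for a constant actuation delay is below; the wrapper and the toy integrator environment are illustrative constructions, not code from the paper, and assume a simple `reset()`/`step()` environment interface.

```python
from collections import deque

class DelayedActionWrapper:
    """State-augmentation sketch: actions reach the plant `delay` steps
    after the agent issues them, and the augmented state includes the
    buffer of pending actions, making the process Markov again."""

    def __init__(self, env, delay, noop_action=0):
        self.env = env
        self.delay = delay
        self.noop = noop_action
        self.buffer = deque()

    def reset(self):
        obs = self.env.reset()
        # Until real commands arrive, the queue holds no-op actions.
        self.buffer = deque([self.noop] * self.delay)
        return (obs, tuple(self.buffer))  # augmented state

    def step(self, action):
        self.buffer.append(action)        # queue the new command
        applied = self.buffer.popleft()   # executed now: chosen `delay` steps ago
        obs, reward, done = self.env.step(applied)
        return (obs, tuple(self.buffer)), reward, done

class Integrator:
    """Toy 1-D plant: state accumulates the applied action."""
    def reset(self):
        self.x = 0
        return self.x
    def step(self, a):
        self.x += a
        return self.x, -abs(self.x), False

env = DelayedActionWrapper(Integrator(), delay=2)
state = env.reset()                 # (0, (0, 0)): plant state + pending actions
state, _, _ = env.step(1)           # no-op applied; our action is still queued
state, _, _ = env.step(1)           # second no-op applied
state, _, _ = env.step(1)           # first real action finally reaches the plant
```

The cost of this construction, noted in the comparative analysis of the survey, is that the augmented state space grows with the delay length, which is one motivation for the predictor-based and recurrent-policy families.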
Problem

Research questions and friction points this paper is trying to address.

time delays
reinforcement learning
Markov Decision Process
cyber-physical systems
control systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Time Delays
Control Systems
Markov Decision Process
Safe RL