Ergodic Risk Measures: Towards a Risk-Aware Foundation for Continual Reinforcement Learning

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing continual reinforcement learning (CRL) frameworks are predominantly risk-neutral: they optimize only the expected long-run performance and ignore behavior beyond the mean, such as tail-risk exposure. Method: This work brings the theory of risk measures into CRL and shows that classical risk measures, such as Conditional Value-at-Risk (CVaR), are in their current form incompatible with the non-stationary, task-evolving nature of continual learning. To address this, it introduces ergodic risk measures, a new class of risk measures that is compatible with the continual setting and characterizes long-run risk exposure. Contribution/Results: A case study of risk-aware continual learning, with accompanying empirical results, illustrates the intuitive appeal and theoretical soundness of ergodic risk measures. The paper presents this as the first formal theoretical treatment of risk-aware continual reinforcement learning.

📝 Abstract

Continual reinforcement learning (continual RL) seeks to formalize the notions of lifelong learning and endless adaptation in RL. In particular, the aim of continual RL is to develop RL agents that can maintain a careful balance between retaining useful information and adapting to new situations. To date, continual RL has been explored almost exclusively through the lens of risk-neutral decision-making, in which the agent aims to optimize the expected (or mean) long-run performance. In this work, we present the first formal theoretical treatment of continual RL through the lens of risk-aware decision-making, in which the agent aims to optimize a reward-based measure of long-run performance beyond the mean. In particular, we show that the classical theory of risk measures, widely used as a theoretical foundation in non-continual risk-aware RL, is, in its current form, incompatible with the continual setting. Then, building on this insight, we extend risk measure theory into the continual setting by introducing a new class of ergodic risk measures that are compatible with continual learning. Finally, we provide a case study of risk-aware continual learning, along with empirical results, which show the intuitive appeal and theoretical soundness of ergodic risk measures.
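To make the distinction between risk-neutral and risk-aware objectives concrete, the sketch below compares the mean with an empirical lower-tail CVaR on two reward sequences. It is only an illustration of a classical, static risk measure, not the ergodic risk measure defined in the paper; the function name `lower_tail_cvar`, the reward values, and the choice `alpha = 0.2` are all hypothetical.

```python
import numpy as np


def lower_tail_cvar(rewards, alpha):
    """Empirical lower-tail CVaR: the mean of the worst alpha-fraction of rewards.

    Hypothetical helper for illustration; not taken from the paper.
    """
    sorted_rewards = np.sort(np.asarray(rewards, dtype=float))
    # Number of samples in the lower tail (at least one).
    k = max(1, int(np.ceil(alpha * sorted_rewards.size)))
    return sorted_rewards[:k].mean()


# Made-up per-step rewards for two policies with the same mean reward.
steady = np.array([1.0] * 10)           # always earns 1
risky = np.array([2.0] * 9 + [-8.0])    # usually earns 2, occasionally loses 8

alpha = 0.2  # assumed tail level: average over the worst 20% of outcomes

mean_steady, mean_risky = steady.mean(), risky.mean()
cvar_steady = lower_tail_cvar(steady, alpha)
cvar_risky = lower_tail_cvar(risky, alpha)
```

A risk-neutral agent is indifferent between the two policies because their means coincide, whereas the tail-based measure ranks `steady` above `risky`. The abstract's point is that measures of this classical kind do not carry over directly to the continual setting, which is what motivates the ergodic variant.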

Problem

Research questions and friction points this paper is trying to address.

Extend risk measure theory to continual reinforcement learning

Introduce ergodic risk measures for lifelong adaptation

Address incompatibility of classical risk measures with continual RL

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing ergodic risk measures for continual RL

Extending risk measure theory to continual learning

Providing empirical validation for ergodic risk measures

🔎 Similar Papers

Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence