Risk-Averse Learning with Varying Risk Levels

📅 2025-12-28
🤖 AI Summary
This paper studies safety-critical online optimization under time-varying risk levels in non-stationary environments, using Conditional Value-at-Risk (CVaR) as the risk measure. To capture how the risk level changes over time, it introduces a risk-level variation metric alongside the function variation metric. Risk-averse learning algorithms with a limited sampling budget are developed for both first-order (function values and gradients) and zeroth-order (function values only) feedback, and their dynamic regret is bounded in terms of function variation, risk-level variation, and the total number of samples. The analysis shows that the algorithms adapt to both environmental non-stationarity and changing risk sensitivity, and numerical experiments illustrate their effectiveness.

📝 Abstract
In safety-critical decision-making, the environment may evolve over time, and the learner adjusts its risk level accordingly. This work investigates risk-averse online optimization in dynamic environments with varying risk levels, employing Conditional Value-at-Risk (CVaR) as the risk measure. To capture the dynamics of the environment and risk levels, we employ the function variation metric and introduce a novel risk-level variation metric. Two information settings are considered: a first-order scenario, where the learner observes both function values and their gradients; and a zeroth-order scenario, where only function evaluations are available. For both cases, we develop risk-averse learning algorithms with a limited sampling budget and analyze their dynamic regret bounds in terms of function variation, risk-level variation, and the total number of samples. The regret analysis demonstrates the adaptability of the algorithms in non-stationary and risk-sensitive settings. Finally, numerical experiments are presented to demonstrate the efficacy of the methods.
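
As a rough illustration of the risk measure, CVaR can be estimated from samples by averaging the worst outcomes. The sketch below is not taken from the paper: the function name `empirical_cvar` is hypothetical, and it assumes the convention that `alpha` in (0, 1] is the tail fraction, so `alpha = 1` recovers the plain mean and smaller `alpha` is more risk-averse. Other texts use the complementary convention.

```python
import numpy as np

def empirical_cvar(losses, alpha):
    """Average of the worst alpha-fraction of sampled losses.

    Assumed convention (not from the paper): alpha in (0, 1] is the
    tail fraction; alpha = 1 gives the sample mean.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # Sort so that the largest (worst) losses come first.
    sorted_losses = np.sort(np.asarray(losses, dtype=float))[::-1]
    # Number of samples in the tail, at least one.
    k = max(1, int(np.ceil(alpha * sorted_losses.size)))
    return sorted_losses[:k].mean()

# A varying risk level: the same samples evaluated at different alpha_t.
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)
for alpha_t in (1.0, 0.5, 0.1):
    print(alpha_t, empirical_cvar(samples, alpha_t))
```

Under this convention the estimate can only increase as `alpha_t` shrinks, which is one way to see why a changing risk level changes the objective even when the loss distribution stays fixed.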
Problem

Research questions and friction points this paper is trying to address.

Develop risk-averse online optimization algorithms for dynamic environments with varying risk levels.
Analyze dynamic regret bounds using function variation and risk-level variation metrics.
Design algorithms for both first-order and zeroth-order information settings with limited sampling.
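
For orientation, the quantities named in the list above are often formalized along the following lines. This is a sketch under assumptions, not the paper's exact definitions, which this summary does not state: the symbols $f_t$, $x_t$, $\alpha_t$, $\mathcal{X}$, $V_T$, and $A_T$ are introduced here for illustration, and the form of the risk-level variation in particular is only a guess at a natural definition.

```latex
% Dynamic regret: cumulative CVaR loss compared with the per-round minimizers
\mathrm{DR}(T) = \sum_{t=1}^{T} \mathrm{CVaR}_{\alpha_t}\!\left[ f_t(x_t) \right]
  - \sum_{t=1}^{T} \min_{x \in \mathcal{X}} \mathrm{CVaR}_{\alpha_t}\!\left[ f_t(x) \right]

% Function variation: how much the cost functions change between rounds
V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \left| f_t(x) - f_{t-1}(x) \right|

% Risk-level variation (assumed form): how much the risk level changes between rounds
A_T = \sum_{t=2}^{T} \left| \alpha_t - \alpha_{t-1} \right|
```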
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Conditional Value-at-Risk (CVaR) as the risk measure.
Introduces a risk-level variation metric to capture time-varying risk levels.
Develops algorithms with a limited sampling budget for both first-order (gradient) and zeroth-order (function-value-only) settings.
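
To make the function-only setting concrete, the sketch below shows a standard two-point random-direction gradient estimator, which builds a gradient estimate from function evaluations alone. It is a generic textbook technique swapped in for illustration, not the paper's estimator; the name `two_point_gradient_estimate` and the parameters `delta` (smoothing radius) and `rng` are hypothetical, and the paper's algorithms apply to a CVaR objective rather than the plain function used here.

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta, rng):
    """Zeroth-order gradient estimate from two function evaluations.

    Generic two-point estimator (an illustrative stand-in, not the
    paper's method): probe f along a random unit direction u.
    """
    x = np.asarray(x, dtype=float)
    d = x.size
    # Draw a direction uniformly from the unit sphere.
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Finite difference along u, rescaled by the dimension d.
    diff = f(x + delta * u) - f(x - delta * u)
    return (d / (2.0 * delta)) * diff * u

# Usage: averaging many estimates for a linear function approximates its gradient.
a = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
estimates = [
    two_point_gradient_estimate(lambda z: a @ z, np.zeros(3), 0.1, rng)
    for _ in range(20000)
]
print(np.mean(estimates, axis=0))
```

A single estimate is noisy, so under a limited sampling budget the number of evaluations per round trades off directly against the variance of the update.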
Siyi Wang
Division of Decision and Control Systems, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 10044 Stockholm, Sweden
Zifan Wang
Division of Decision and Control Systems, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 10044 Stockholm, Sweden
Karl H. Johansson
EECS and Digital Futures, KTH Royal Institute of Technology, Sweden
Control theory, Cyber-physical systems, Networked control, Hybrid systems, Machine learning