Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions

📅 2025-12-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-world sequential decision-making often involves parametric action spaces—comprising discrete action types coupled with continuous execution parameters—yet existing reinforcement learning methods either require manual modeling, fail to jointly capture the hybrid action structure, or rely on domain-specific priors. Method: We propose the first online RL framework that jointly learns state and action abstractions, unifying hybrid action spaces via context-sensitive abstraction, TD(λ) learning, and adaptive granularity control. Contribution/Results: Our approach requires no handcrafted models or prior knowledge, dynamically refining representation resolution in task-critical regions. It significantly improves sample efficiency in long-horizon, sparse-reward settings. Empirically, it outperforms current state-of-the-art methods across multiple continuous-state, parametric-action benchmark tasks.
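The two core ingredients named above, a hybrid (discrete type plus continuous parameter) action and tabular TD(λ) learning with eligibility traces, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; `ParameterizedAction` and `td_lambda_update` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParameterizedAction:
    """A hybrid action: a discrete action type plus continuous parameters."""
    action_type: str   # e.g. "kick"
    params: tuple      # e.g. (angle, power)

def td_lambda_update(values, traces, state, next_state, reward,
                     alpha=0.1, gamma=0.99, lam=0.9):
    """One tabular TD(lambda) step with accumulating eligibility traces.

    `values` maps abstract states to value estimates; `traces` maps them
    to eligibility. Every state with nonzero trace shares in the update.
    """
    delta = reward + gamma * values.get(next_state, 0.0) - values.get(state, 0.0)
    traces[state] = traces.get(state, 0.0) + 1.0
    for s in list(traces):
        values[s] = values.get(s, 0.0) + alpha * delta * traces[s]
        traces[s] *= gamma * lam
    return values, traces
```

In the abstraction-driven setting, the keys of `values` would be learned abstract state clusters rather than raw continuous states, which is what makes the tabular update applicable.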

📝 Abstract
Real-world sequential decision-making often involves parameterized action spaces that require both decisions about which discrete action to take and decisions about the continuous action parameters governing how that action is executed. Existing approaches exhibit severe limitations in this setting: planning methods demand hand-crafted action models; standard reinforcement learning (RL) algorithms are designed for either discrete or continuous actions but not both; and the few RL methods that handle parameterized actions typically rely on domain-specific engineering and fail to exploit the latent structure of these spaces. This paper extends the scope of RL algorithms to long-horizon, sparse-reward settings with parameterized actions by enabling agents to autonomously learn both state and action abstractions online. We introduce algorithms that progressively refine these abstractions during learning, adding fine-grained detail in the critical regions of the state-action space where greater resolution improves performance. Across several continuous-state, parameterized-action domains, our abstraction-driven approach enables TD($λ$) to achieve markedly higher sample efficiency than state-of-the-art baselines.
Problem

Research questions and friction points this paper is trying to address.

Develop RL algorithms for parameterized action spaces
Enable autonomous learning of state and action abstractions
Improve sample efficiency in sparse-reward, long-horizon settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous learning of state and action abstractions online
Progressive refinement of abstractions in critical regions
Enhancing TD(λ) sample efficiency in parameterized-action domains
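The "progressive refinement in critical regions" idea can be illustrated with a toy 1-D state discretization that splits a cell once the learning error accumulated in it crosses a threshold, adding resolution only where it matters. `AdaptiveGrid` and its threshold are illustrative assumptions, not the paper's actual mechanism.

```python
class AdaptiveGrid:
    """1-D state abstraction that refines cells where TD error accumulates."""

    def __init__(self, low, high, split_threshold=1.0):
        self.edges = [low, high]   # sorted cell boundaries
        self.error = {0: 0.0}      # accumulated |TD error| per cell
        self.split_threshold = split_threshold

    def cell(self, x):
        """Index of the cell containing x."""
        for i in range(len(self.edges) - 1):
            if self.edges[i] <= x < self.edges[i + 1]:
                return i
        return len(self.edges) - 2  # x at or above the top edge

    def record_error(self, x, td_error):
        """Accumulate error at x; split that cell when it crosses the threshold."""
        i = self.cell(x)
        self.error[i] += abs(td_error)
        if self.error[i] >= self.split_threshold:
            mid = (self.edges[i] + self.edges[i + 1]) / 2
            self.edges.insert(i + 1, mid)
            # reset error counters after refinement
            self.error = {j: 0.0 for j in range(len(self.edges) - 1)}
```

For example, repeated large TD errors around a bottleneck region would split only the cells covering that region, leaving the rest of the state space coarse.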