Using causal abstractions to accelerate decision-making in complex bandit problems

📅 2025-09-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Real-world causal decision-making problems can often be modeled at several levels of abstraction, yet existing causal multi-armed bandit (CMAB) methods lack a general way to share information and computational effort across those levels. AT-UCB addresses this by drawing on causal abstraction theory: it first explores in a cheap-to-simulate, coarse-grained causal model, then uses what it learned to restrict the set of candidate actions in the fine-grained model of interest, where it runs the standard upper confidence bound (UCB) algorithm. The paper supports this with a novel upper bound on cumulative regret and with experiments on epidemiological simulators of varying resolution and computational cost, reporting significantly lower cumulative regret than classical UCB.

📝 Abstract

Although real-world decision-making problems can often be encoded as causal multi-armed bandits (CMABs) at different levels of abstraction, a general methodology exploiting the information and computational advantages of each abstraction level is missing. In this paper, we propose AT-UCB, an algorithm which efficiently exploits shared information between CMAB problem instances defined at different levels of abstraction. More specifically, AT-UCB leverages causal abstraction (CA) theory to explore within a cheap-to-simulate and coarse-grained CMAB instance, before employing the traditional upper confidence bound (UCB) algorithm on a restricted set of potentially optimal actions in the CMAB of interest, leading to significant reductions in cumulative regret when compared to the classical UCB algorithm. We illustrate the advantages of AT-UCB theoretically, through a novel upper bound on the cumulative regret, and empirically, by applying AT-UCB to epidemiological simulators with varying resolution and computational cost.
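The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`ucb1`, `two_stage_ucb`), the `abstraction` dictionary mapping fine-grained actions to coarse-grained ones, the `margin` rule for deciding which coarse actions remain "potentially optimal", and the toy Bernoulli reward means are all assumptions made for illustration only.

```python
import math
import random


def ucb1(pull, arms, horizon, rng):
    """Classical UCB1 on `arms`; returns (empirical means, pull counts)."""
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    for t in range(1, horizon + 1):
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            arm = untried[0]  # play every arm once before using the index
        else:
            arm = max(
                arms,
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        sums[arm] += pull(arm, rng)
        counts[arm] += 1
    means = {a: sums[a] / counts[a] for a in arms if counts[a] > 0}
    return means, counts


def two_stage_ucb(pull_coarse, pull_fine, abstraction,
                  coarse_horizon, fine_horizon, margin, seed=0):
    """Explore the cheap coarse bandit, then run UCB1 on a restricted fine set.

    `abstraction` maps each fine action to a coarse action (hypothetical
    stand-in for a causal abstraction). `margin` is an assumed pruning rule:
    keep coarse actions whose empirical mean is within `margin` of the best.
    """
    rng = random.Random(seed)
    coarse_arms = sorted(set(abstraction.values()))
    coarse_means, _ = ucb1(pull_coarse, coarse_arms, coarse_horizon, rng)

    best = max(coarse_means.values())
    kept = {c for c, m in coarse_means.items() if m >= best - margin}

    # Fine actions whose coarse image survived the pruning step.
    candidates = [a for a in abstraction if abstraction[a] in kept]
    _, counts = ucb1(pull_fine, candidates, fine_horizon, rng)
    return candidates, counts


# Toy instance (assumed values): four fine actions grouped into two coarse ones.
ABSTRACTION = {"a0": "low", "a1": "low", "a2": "high", "a3": "high"}
COARSE_MEANS = {"low": 0.1, "high": 0.8}
FINE_MEANS = {"a0": 0.1, "a1": 0.15, "a2": 0.7, "a3": 0.8}


def bernoulli_coarse(arm, rng):
    return 1.0 if rng.random() < COARSE_MEANS[arm] else 0.0


def bernoulli_fine(arm, rng):
    return 1.0 if rng.random() < FINE_MEANS[arm] else 0.0


if __name__ == "__main__":
    candidates, counts = two_stage_ucb(
        bernoulli_coarse, bernoulli_fine, ABSTRACTION,
        coarse_horizon=500, fine_horizon=2000, margin=0.2,
    )
    print("restricted action set:", candidates)
    print("fine-level pull counts:", counts)
```

In this sketch the expensive fine-level bandit never pulls actions that the coarse stage ruled out, which is the mechanism the abstract credits for the regret reduction. How reliably such pruning preserves the optimal action depends on how well the coarse model agrees with the fine one, a question the sketch does not address.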

Problem

Research questions and friction points this paper is trying to address.

Accelerating decision-making in complex causal bandit problems

Exploiting shared information across different abstraction levels

Reducing cumulative regret using causal abstraction theory

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages causal abstraction theory for exploration

Combines coarse-grained simulation with the UCB algorithm

Reduces cumulative regret through multi-level optimization
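
For context on why restricting the action set can lower regret, recall the textbook guarantee for classical UCB1 (Auer, Cesa-Bianchi and Fischer, 2002) with rewards in [0, 1]. This is the standard bound, not the new bound derived in the paper; the symbols below (action set A, gap Δ_a between the optimal mean reward and that of action a, horizon T) are the usual bandit notation and are assumed here rather than taken from the source.

```latex
\mathbb{E}[R_T] \;\le\; 8 \sum_{a \in \mathcal{A} \,:\, \Delta_a > 0} \frac{\ln T}{\Delta_a}
\;+\; \left(1 + \frac{\pi^2}{3}\right) \sum_{a \in \mathcal{A}} \Delta_a
```

Both sums run over the actions being played, so running UCB on a smaller set that still contains the optimal action removes terms from the bound. The cost of obtaining that smaller set from the coarse-grained model is not captured by this expression.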

🔎 Similar Papers

The Causal Information Bottleneck and Optimal Causal Variable Abstractions