🤖 AI Summary
This paper addresses the problem of identifying the minimal sufficient intervention set in conditional causal bandits: given a causal graph, determine the smallest subset of nodes such that interventions conditioned on their observed values achieve the optimal expected reward. We first derive a graph-theoretic characterization of the nodes that are necessarily included in any optimal conditional intervention. Based on this characterization, we design an exact algorithm with O(|V| + |E|) time complexity. Our method integrates causal graph reasoning, conditional intervention modeling, and efficient graph traversal, circumventing exhaustive search while provably minimizing the dimension of the bandit action space. Experiments demonstrate that our algorithm significantly accelerates the convergence of standard bandit policies, and its effectiveness and scalability are validated on multiple benchmark causal graphs.
📝 Abstract
Causal knowledge can be used to support decision-making. This has been recognized in the causal bandits literature, where a causal (multi-armed) bandit is characterized by a causal graphical model and a target variable. The arms are interventions on the causal model, and rewards are samples of the target variable. Causal bandits were originally studied with a focus on hard interventions. We focus instead on settings where the arms are conditional interventions, which model many real-world decision-making problems more accurately by allowing the value of the intervened variable to be chosen based on the observed values of other variables. This paper presents a graphical characterization of the minimal set of nodes guaranteed to contain the optimal conditional intervention, i.e., the one that maximizes the expected reward. We then propose an efficient algorithm with a time complexity of $O(|V| + |E|)$ to identify this minimal set of nodes. We prove that both the graphical characterization and the proposed algorithm are correct. Finally, we empirically demonstrate that our algorithm significantly prunes the search space and substantially accelerates convergence when integrated into standard multi-armed bandit algorithms.
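The paper's full graphical characterization is not reproduced here, but one standard linear-time building block of this kind of pruning can be sketched: restricting attention to the ancestors of the reward node, since intervening on a non-ancestor of the reward variable cannot change its distribution. The following Python sketch is purely illustrative; the graph encoding and function names are our own assumptions, not the paper's algorithm.

```python
from collections import deque

def ancestors_of(graph, target):
    """Return the set of ancestors of `target` in a DAG.

    `graph` maps each node to the list of its children (directed edges).
    Runs in O(|V| + |E|): one pass to reverse the edges, then a
    breadth-first search over the reversed graph.
    """
    # Build the reversed adjacency list: node -> parents.
    parents = {v: [] for v in graph}
    for v, children in graph.items():
        for c in children:
            parents.setdefault(c, []).append(v)

    seen = set()
    queue = deque(parents.get(target, []))
    while queue:
        v = queue.popleft()
        if v not in seen:
            seen.add(v)
            queue.extend(parents.get(v, []))
    return seen

# Hypothetical causal graph: U -> X -> Z -> Y, W -> Y, plus an
# irrelevant node R that no sound intervention set needs to include.
g = {"U": ["X"], "X": ["Z"], "Z": ["Y"], "W": ["Y"], "Y": [], "R": []}
print(sorted(ancestors_of(g, "Y")))  # ['U', 'W', 'X', 'Z'] -- R is pruned
```

This ancestor pruning alone is weaker than the minimal set the paper characterizes; it only illustrates how a single $O(|V| + |E|)$ traversal can already shrink the bandit action space before any learning begins.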