Generalized Fitted Q-Iteration with Clustered Data

📅 2025-10-04

📈 Citations: 0

✨ Influential: 0

career value

220K/year

🤖 AI Summary

In reinforcement learning applications such as healthcare, observational data often exhibit intra-cluster correlation—e.g., repeated measurements from the same patient—violating the standard i.i.d. assumption and degrading policy evaluation and optimization. Method: We propose Generalized Fitted Q-Iteration (G-FQI), the first algorithm to integrate Generalized Estimating Equations (GEE) into the FQI framework, explicitly modeling clustering structure in state–action value function estimation. Contribution/Results: G-FQI achieves optimal statistical efficiency under correct specification of the correlation structure and retains parameter consistency under misspecification, substantially improving robustness. Its convergence and asymptotic normality are theoretically guaranteed. Empirical evaluations on synthetic benchmarks and real-world mobile health data demonstrate that G-FQI reduces cumulative regret by 50% on average compared to standard FQI, while markedly enhancing both policy performance and stability.

Technology Category

Application Category

📝 Abstract

This paper focuses on reinforcement learning (RL) with clustered data, which is commonly encountered in healthcare applications. We propose a generalized fitted Q-iteration (FQI) algorithm that incorporates generalized estimating equations into policy learning to handle the intra-cluster correlations. Theoretically, we demonstrate (i) the optimalities of our Q-function and policy estimators when the correlation structure is correctly specified, and (ii) their consistencies when the structure is mis-specified. Empirically, through simulations and analyses of a mobile health dataset, we find the proposed generalized FQI achieves, on average, a half reduction in regret compared to the standard FQI.

Problem

Research questions and friction points this paper is trying to address.

Develops reinforcement learning for clustered healthcare data

Incorporates correlation structures into Q-function estimation

Reduces regret compared to standard FQI methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalized FQI algorithm with clustered data

Incorporates generalized estimating equations

Handles intra-cluster correlation in policy learning

🔎 Similar Papers

Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning