The Indoor-Training Effect: unexpected gains from distribution shifts in the transition function

📅 2024-01-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies how a shift between training and testing environments affects generalization in reinforcement learning, focusing on shifts in the transition function. Contrary to the conventional assumption that training and testing on the same environment is best, agents trained in a noise-free environment can outperform agents trained on a noisy environment when both are evaluated on that noisy environment; the authors call this the Indoor-Training Effect. Methodologically, the paper builds a family of δ-environments by injecting quantifiable, parametric noise into the transition function of a given MDP, and also tests behaviour-level variations of ATARI games, including Ghost behaviour in PacMan and Paddle behaviour in Pong. The effect is demonstrated across 60 variations of ATARI games, including PacMan, Pong, and Breakout, indicating that it extends beyond noise injection. Code to reproduce the experiments is publicly available.

📝 Abstract

Is it better to perform tennis training in a pristine indoor environment or a noisy outdoor one? To model this problem, here we investigate whether shifts in the transition probabilities between the training and testing environments in reinforcement learning problems can lead to better performance under certain conditions. We generate new Markov Decision Processes (MDPs) starting from a given MDP, by adding quantifiable, parametric noise into the transition function. We refer to this process as Noise Injection and the resulting environments as δ-environments. This process allows us to create variations of the same environment with quantitative control over noise serving as a metric of distance between environments. Conventional wisdom suggests that training and testing on the same MDP should yield the best results. In stark contrast, we observe that agents can perform better when trained on the noise-free environment and tested on the noisy δ-environments, compared to training and testing on the same δ-environments. We confirm that this finding extends beyond noise variations: it is possible to showcase the same phenomenon in ATARI game variations including varying Ghost behaviour in PacMan, and Paddle behaviour in Pong. We demonstrate this intriguing behaviour across 60 different variations of ATARI games, including PacMan, Pong, and Breakout. We refer to this phenomenon as the Indoor-Training Effect. Code to reproduce our experiments and to implement Noise Injection can be found at https://bit.ly/3X6CTYk.
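
The abstract does not spell out the exact form of the injected noise, so the following is only a minimal sketch of one plausible reading: each transition distribution of a small tabular MDP is mixed with a uniform distribution over next states, with delta controlling the mixing weight. The function name `inject_noise`, the uniform-mixture form, and the toy two-state MDP are all assumptions made for illustration and are not taken from the authors' released code.

```python
from typing import List

# P[s][a][s_next] is the probability of reaching s_next from s under action a.
Transition = List[List[List[float]]]


def inject_noise(P: Transition, delta: float) -> Transition:
    """Return a new transition function that mixes P with uniform noise.

    ASSUMPTION: the uniform-mixture form (1 - delta) * P + delta * U is an
    illustrative choice, not the paper's confirmed definition.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n_states = len(P)
    uniform = 1.0 / n_states
    return [
        [
            [(1.0 - delta) * p + delta * uniform for p in next_state_probs]
            for next_state_probs in actions
        ]
        for actions in P
    ]


# Hypothetical deterministic MDP with 2 states and 2 actions.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: action 0 stays, action 1 moves
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: action 0 stays, action 1 moves
]

# delta = 0 recovers the original MDP; larger delta moves further from it.
P_delta = inject_noise(P, 0.2)
```

With delta = 0 the original (noise-free) MDP is returned unchanged, and every row of the result remains a valid probability distribution, which is what lets delta act as a rough measure of distance between the training and testing environments.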

Problem

Research questions and friction points this paper is trying to address.

Environmental Variability

Skill Transfer

Performance Enhancement

Innovation

Methods, ideas, or system contributions that make the work stand out.

Indoor Training Effect

Adaptability Enhancement

Deterministic Training in Uncertain Environments

🔎 Similar Papers

Revisiting Knowledge Distillation under Distribution Shift