🤖 AI Summary
This work addresses the problem of locating an odor source in a turbulent flow by turning sparse olfactory observations into a robust navigation strategy. The authors train an agent with tabular Q-learning using a minimal state representation: the time elapsed since the last odor detection, which acts as a running clock. The agent learns an interpretable plume-recovery policy that combines behaviors observed in insects: surging, casting, and a return downwind. Evaluated on odor plumes from direct numerical simulations of turbulence, the agent performs well but cannot adapt its strategy to the local intermittency level; the authors show that giving it more flexibility improves robustness.
📝 Abstract
Finding an odor source in a turbulent flow requires turning the history of olfactory observations into a robust navigation strategy. In this work, we use tabular Q-learning to train an olfactory search agent with a minimal memory of past observations: only a running clock since the last whiff. This agent learns an interpretable strategy to recover the plume that combines well-known behaviors observed in insects: surging, casting, and a return downwind. While achieving good performance on data from direct numerical simulations of turbulence, the agent is limited by an inability to adapt its strategy to the local intermittency level; we show that providing more flexibility improves robustness.
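To make the clock-only state concrete, here is a minimal Python sketch of tabular Q-learning in which the table is indexed solely by the number of steps since the last whiff. Everything beyond that idea is an assumption made for illustration and is not taken from the paper: the cone-shaped toy plume in `detect`, the grid actions, the reward of -1 per step, the cap `T_MAX`, the detection probability `P_DETECT`, and all hyperparameters are hypothetical stand-ins for the direct numerical simulation data and the authors' actual setup.

```python
import numpy as np

# Hypothetical toy setup (NOT the paper's DNS data): wind blows along +x,
# the source sits at the origin, and odor is detected at random inside a cone.
ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # upwind, downwind, crosswind +y, crosswind -y
T_MAX = 20      # assumed cap on the clock so the Q-table stays finite
P_DETECT = 0.5  # assumed detection probability inside the cone (mimics intermittency)


def update_clock(clock, whiff):
    """Reset the clock on a whiff, otherwise advance it up to T_MAX."""
    return 0 if whiff else min(clock + 1, T_MAX)


def detect(pos, rng):
    """Toy intermittent plume: whiffs occur only inside a cone downwind of the source."""
    x, y = pos
    inside = x > 0 and abs(y) <= 1 + x // 4
    return bool(inside and rng.random() < P_DETECT)


def train(episodes=500, alpha=0.1, gamma=0.99, eps=0.1, max_steps=200, seed=0):
    """Epsilon-greedy tabular Q-learning; the state is the clock value only."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((T_MAX + 1, len(ACTIONS)))
    for _ in range(episodes):
        # Start on the plume axis, downwind of the source, as if odor was just detected.
        pos, clock = (int(rng.integers(10, 20)), 0), 0
        for _ in range(max_steps):
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))
            else:
                a = int(np.argmax(Q[clock]))
            pos = (pos[0] + ACTIONS[a][0], pos[1] + ACTIONS[a][1])
            done = pos == (0, 0)
            next_clock = update_clock(clock, detect(pos, rng))
            # Reward of -1 per step, so shorter searches score higher.
            target = -1.0 + (0.0 if done else gamma * np.max(Q[next_clock]))
            Q[clock, a] += alpha * (target - Q[clock, a])
            clock = next_clock
            if done:
                break
    return Q


if __name__ == "__main__":
    Q = train()
    names = ["upwind", "downwind", "crosswind +y", "crosswind -y"]
    for t in range(0, T_MAX + 1, 5):
        print(f"clock={t:2d} -> greedy action: {names[int(np.argmax(Q[t]))]}")
```

Because the table has one row per clock value, the greedy policy can be read off directly as "what to do t steps after the last whiff", which is what makes this kind of agent interpretable. The same structure also illustrates the limitation noted in the abstract: a single table indexed only by the clock has no way to condition on the local intermittency level.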