Human-Inspired Pavlovian and Instrumental Learning for Autonomous Agent Navigation

📅 2026-03-23
🤖 AI Summary
This work proposes a hybrid reinforcement learning architecture inspired by human neural mechanisms to address a core challenge for autonomous agents in uncertain environments: balancing rapid responsiveness with goal-directed planning. Integrating Pavlovian conditioning, model-free instrumental learning, and model-based reasoning, the framework leverages environmental spatial features as conditioned stimuli to generate intrinsic value signals. A motivation-modulated Bayesian arbitration mechanism dynamically coordinates these strategies based on contextual uncertainty. Experimental results demonstrate that the approach significantly accelerates learning, enhances navigation safety, reduces unproductive exploration in high-uncertainty regions, and enables a smooth transition from exploratory behavior to planning-driven control.
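The summary describes conditioned stimuli (spatial features) acquiring intrinsic value through Pavlovian learning. The paper does not publish its update rule here, so the following is a minimal illustrative sketch using the classical Rescorla-Wagner delta rule over a linear feature read-out; the function names and the linear form are assumptions, not the authors' implementation.

```python
import numpy as np

def pavlovian_value(cs_features, weights):
    """Intrinsic value of a state as a linear read-out of its
    conditioned-stimulus (CS) features, e.g. spatial landmarks."""
    return float(np.dot(weights, cs_features))

def rescorla_wagner_update(weights, cs_features, outcome, alpha=0.1):
    """Rescorla-Wagner delta rule: shift the CS weights toward the
    observed outcome in proportion to the prediction error."""
    prediction = np.dot(weights, cs_features)
    delta = outcome - prediction          # Pavlovian prediction error
    return weights + alpha * delta * np.asarray(cs_features)
```

Repeated pairings of a feature with a given outcome drive the read-out toward that outcome, which is the sense in which environmental cues come to carry value signals that can bias exploration.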

📝 Abstract
Autonomous agents operating in uncertain environments must balance fast responses with goal-directed planning. Classical model-free (MF) reinforcement learning (RL) often converges slowly and may induce unsafe exploration, whereas model-based (MB) methods are computationally expensive and sensitive to model mismatch. This paper presents a human-inspired hybrid RL architecture integrating Pavlovian, instrumental MF, and instrumental MB components. Inspired by Pavlovian and instrumental learning in neuroscience, the framework treats contextual radio cues, i.e., georeferenced environmental features acting as conditioned stimuli (CS), as signals that shape intrinsic value and bias decision-making. Learning is further modulated by internal motivational drives through a dedicated motivational signal. A Bayesian arbitration mechanism adaptively blends MF and MB estimates based on their predicted reliability. Simulation results show that the hybrid approach accelerates learning, improves operational safety, and reduces navigation through high-uncertainty regions compared with standard RL baselines. Pavlovian conditioning promotes safer exploration and faster convergence, while arbitration enables a smooth transition from exploration to efficient, plan-driven exploitation. Overall, the results highlight the benefits of biologically inspired modularity for robust, adaptive autonomous systems under uncertainty.
Problem

Research questions and friction points this paper is trying to address.

autonomous agent navigation
uncertain environments
reinforcement learning
model-based
model-free
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pavlovian learning
hybrid reinforcement learning
Bayesian arbitration
motivational modulation
autonomous navigation
Jingfeng Shan
University of Bologna, Bologna, Italy
Francesco Guidi
CNR IEIIT
Wireless communications, RFID, UWB, Radar
Mehrdad Saeidi
University of Bologna, Bologna, Italy
Enrico Testi
Junior Assistant Professor, University of Bologna
Massive Multiple Access, Deep Learning, Cell-free mMIMO, IoT, Satellite IoT
Elia Favarelli
Junior Assistant Professor, University of Bologna
Machine Learning, AI, Structural Monitoring, Tracking
Andrea Giorgetti
University of Bologna
Signal processing, integrated sensing and communication, machine learning, cognitive radio
Davide Dardari
University of Bologna
Wireless communication and localization
Alberto Zanella
National Research Council of Italy, Bologna, Italy
Giorgio Li Pira
University of Bologna, Department of Psychology “Renzo Canestrari”, 47521 Cesena, Italy
Francesca Starita
University of Bologna, Department of Psychology “Renzo Canestrari”, 47521 Cesena, Italy
Anna Guerra
University of Bologna, UNIBO
Localization, Wireless Communications, Signal Processing