🤖 AI Summary
This paper addresses critical challenges and research gaps in realizing ethical behavior in autonomous systems via reinforcement learning (RL). It systematically reviews advances at the intersection of RL and machine ethics since 2020. Methodologically, it introduces the first bidirectional mapping taxonomy linking RL techniques to machine ethics principles, identifying three emerging trends: (i) formal ethical specification methods, (ii) RL component modifications (e.g., reward shaping, constrained optimization, value alignment, and explainability enhancement), and (iii) dedicated ethical simulation environments. The work integrates deontic logic, utility function constraints, and other formal ethical frameworks with RL mechanisms. A structured knowledge graph encompassing 127 publications is constructed, revealing key technical bottlenecks, including insufficient dynamic ethical generalization and inadequate modeling of multi-agent value conflicts. Finally, the paper proposes a theoretical framework and actionable pathways for trustworthy AI governance grounded in this synthesis.
📝 Abstract
Machine ethics is the field that studies how ethical behaviour can be accomplished by autonomous systems. While some systematic reviews consolidate the state of the art in machine ethics prior to 2020, these tend not to include work in which reinforcement learning agents are the entities whose ethical behaviour is to be achieved. The reason is that only in recent years have we witnessed an increase in machine ethics studies within reinforcement learning. We present here a systematic review of reinforcement learning for machine ethics and machine ethics within reinforcement learning. Additionally, we highlight trends in ethics specifications, in the components and frameworks of reinforcement learning, and in the environments used to elicit ethical behaviour. Our systematic review aims to consolidate the work in machine ethics and reinforcement learning, thus filling the gap in the state-of-the-art machine ethics landscape.