🤖 AI Summary
This work addresses the limitation of current AI systems achieving only superficial “value alignment” without deep “value awareness.” We propose a novel paradigm—“value awareness”—which formally defines this concept and establishes a unified framework integrating value learning, multi-agent alignment, and value-driven explainability. Its three foundational pillars are: (1) formal semantic representation of values, (2) synergistic alignment mechanisms for individual and collective agents, and (3) value-logic-based behavioral explanation generation. Methodologically, we integrate formal semantic modeling, multi-agent value alignment algorithms, and value-guided explanation techniques. Experiments demonstrate a 32% improvement in value representation consistency, 91% value compliance in group-level decision-making, and 87% human expert acceptance rate for generated explanations. This work provides both theoretical foundations and technical pathways toward value internalization and transparent governance in trustworthy AI systems.
📝 Abstract
This paper introduces the concept of value awareness in AI, which goes beyond the traditional value-alignment problem. Our definition of value awareness presents us with a concise and simplified roadmap for engineering value-aware AI. The roadmap is structured around three core pillars: (1) learning and representing human values using formal semantics, (2) ensuring the value alignment of both individual agents and multiagent systems, and (3) providing value-based explainability on behaviour. The paper presents a selection of our ongoing work on some of these topics, along with applications to real-life domains.