🤖 AI Summary
To address task ambiguity arising from semantic vagueness in natural-language instructions within language-conditioned reinforcement learning, this paper proposes a distributional semantic alignment framework. Methodologically, it: (1) establishes a distributional policy theory that explicitly models inter-task discriminability; (2) designs a semantic alignment module that achieves fine-grained matching between language instructions and behavioral trajectories via value distribution estimation and cross-modal embedding alignment; and (3) jointly optimizes the language model and policy network to enable end-to-end instruction-to-behavior mapping. Evaluated on both structured and visual observation benchmarks, the approach significantly outperforms existing baselines, effectively mitigating instruction ambiguity and improving generalization across tasks. The implementation is publicly available.
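The summary's cross-modal embedding alignment step can be illustrated with a generic contrastive (InfoNCE-style) objective that pulls each instruction embedding toward its paired trajectory embedding and away from mismatched ones. This is a minimal NumPy sketch of that idea, not DAIL's actual loss; the function name, temperature value, and pairing scheme are illustrative assumptions:

```python
import numpy as np

def infonce_alignment_loss(instr_emb, traj_emb, temperature=0.1):
    """Contrastive loss aligning instruction embeddings with their paired
    trajectory embeddings. Row i of each matrix is a matched pair; the
    other rows in the batch serve as negatives. (Illustrative sketch,
    not the paper's exact objective.)"""
    # L2-normalise so the dot product becomes cosine similarity
    instr = instr_emb / np.linalg.norm(instr_emb, axis=1, keepdims=True)
    traj = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    logits = instr @ traj.T / temperature            # (N, N) similarity matrix
    # Cross-entropy with the diagonal (matched pairs) as the targets
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
traj = rng.normal(size=(8, 16))
# Embeddings close to their paired trajectories score a lower loss
aligned_loss = infonce_alignment_loss(traj + 0.01 * rng.normal(size=(8, 16)), traj)
random_loss = infonce_alignment_loss(rng.normal(size=(8, 16)), traj)
```

Under this kind of objective, well-aligned instruction/trajectory pairs drive the loss toward zero, which is one common way to realize the fine-grained matching the summary describes.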
📝 Abstract
Comprehending natural language and following human instructions are critical capabilities for intelligent agents. However, the flexibility of linguistic instructions induces substantial ambiguity across language-conditioned tasks, severely degrading algorithmic performance. To address these limitations, we present a novel method named DAIL (Distributional Aligned Learning), featuring two key components: a distributional policy and a semantic alignment module. Specifically, we provide theoretical results showing that the value distribution estimation mechanism enhances task differentiability. Meanwhile, the semantic alignment module captures the correspondence between trajectories and linguistic instructions. Extensive experimental results on both structured and visual observation benchmarks demonstrate that DAIL effectively resolves instruction ambiguities, achieving performance superior to baseline methods. Our implementation is available at https://github.com/RunpengXie/Distributional-Aligned-Learning.