🤖 AI Summary
This work proposes JANUS, a cognitive architecture designed to address key challenges in conversational human–agent interaction—namely, difficulties in maintaining context, interpreting ambiguous requests, and ensuring verifiable responses. JANUS models interaction as a partially observable Markov decision process and integrates an inner speech mechanism with a persistent memory system through a modular controller. It uniquely combines inner speech, hierarchical memory, and explicit policy control to support judgments of informational sufficiency, readiness for execution, and tool grounding. Semantic retrieval from memory and evidence-bundle constraints ensure responses are both faithful and auditable. In a dietary assistance scenario driven by a knowledge graph, module-level tests show high reference consistency and practical latency, supporting hierarchical, factored reasoning for long-term collaborative tasks.
📝 Abstract
Dialogue-based human-robot interaction requires robot cognitive assistants to maintain persistent user context, recover from underspecified requests, and ground responses in external evidence, while keeping intermediate decisions verifiable. In this paper we introduce JANUS, a cognitive architecture for assistive robots that models interaction as a partially observable Markov decision process and realizes control as a factored controller with typed interfaces. To this aim, JANUS (i) decomposes the overall behavior into specialized modules for scope detection, intent recognition, memory, inner speech, query generation, and outer speech, and (ii) exposes explicit policies for information sufficiency, execution readiness, and tool grounding. A dedicated memory agent maintains a bounded recent-history buffer, a compact core memory, and an archival store with semantic retrieval, coupled through controlled consolidation and revision policies. A module inspired by the notion of inner speech in cognitive theories provides a control-oriented internal textual flow that validates parameter completeness and triggers clarification before grounding, while a faithfulness constraint ties robot-to-human claims to an evidence bundle combining working context and retrieved tool outputs. We evaluate JANUS through module-level unit tests in a dietary assistance domain grounded in a knowledge graph, reporting high agreement with curated references and practical latency profiles. These results support factored reasoning as a promising path to scalable, auditable, and evidence-grounded robot assistance over extended interaction horizons.
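The three-tier memory agent described above (bounded recent-history buffer, compact core memory, archival store with semantic retrieval, coupled by consolidation and revision policies) could be organized roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: all class and method names are hypothetical, and a naive bag-of-words cosine similarity stands in for whatever semantic retrieval the system actually uses.

```python
from collections import Counter, deque
from dataclasses import dataclass, field
from math import sqrt


def _bow(text: str) -> Counter:
    # Naive bag-of-words stand-in for a semantic embedding.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    num = sum(a[t] * b[t] for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


@dataclass
class MemoryAgent:
    """Hypothetical three-tier memory: recent buffer, core memory, archive."""
    buffer_size: int = 4
    recent: deque = field(default_factory=deque)   # bounded recent-history buffer
    core: dict = field(default_factory=dict)       # compact, durable user facts
    archive: list = field(default_factory=list)    # archival store of past turns

    def __post_init__(self):
        self.recent = deque(self.recent, maxlen=self.buffer_size)

    def observe(self, turn: str) -> None:
        # Consolidation policy: when the bounded buffer is full, the oldest
        # turn is moved into the archive before the new turn is appended.
        if len(self.recent) == self.recent.maxlen:
            self.archive.append(self.recent[0])
        self.recent.append(turn)

    def revise_core(self, key: str, value: str) -> None:
        # Revision policy: core memory keeps one current value per fact key.
        self.core[key] = value

    def retrieve(self, query: str, k: int = 2) -> list:
        # Semantic retrieval over the archival store, best matches first.
        q = _bow(query)
        ranked = sorted(self.archive, key=lambda t: _cosine(q, _bow(t)), reverse=True)
        return ranked[:k]
```

For example, after four turns with `buffer_size=2`, the two oldest turns are consolidated into the archive, and a query such as `retrieve("allergic peanuts")` surfaces the archived turn mentioning the allergy, while `revise_core("diet", "vegetarian")` overwrites the corresponding core fact.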