🤖 AI Summary
Existing proactive agents struggle to accurately identify and respond to users’ latent needs in real-world scenarios characterized by complexity, ambiguity, real-time dynamics, and long-term horizons. To address this challenge, this work proposes the DD-MM-PAS framework and implements it in the PASK agent system, which integrates a streaming intent recognition model (IntentFlow), a hybrid memory architecture spanning workspace, user, and global levels, and a low-latency closed-loop proactive decision-making mechanism. The authors also introduce LatentNeeds-Bench, a benchmark constructed from real user data for evaluating how well agents respond to latent needs. Experimental results show that IntentFlow performs on par with Gemini3-Flash under strict latency constraints while capturing deeper, more nuanced user intent, which supports the system's effectiveness in real-world settings.
📝 Abstract
Proactivity is a core expectation for AGI. Prior work remains largely confined to laboratory settings, leaving a clear gap for real-world proactive agents, which must handle depth, complexity, ambiguity, precision, and real-time constraints. We study this setting, where useful intervention requires inferring latent needs from ongoing context and grounding actions in evolving user memory under latency and long-horizon constraints. We first propose DD-MM-PAS (Demand Detection, Memory Modeling, Proactive Agent System) as a general paradigm for streaming proactive AI agents. We instantiate this paradigm in Pask, which comprises a streaming IntentFlow model for DD, a hybrid memory (workspace, user, global) for long-term MM, and a PAS infrastructure framework, and we describe how these components form a closed loop. We also introduce LatentNeeds-Bench, a real-world benchmark built from user-consented data and refined through thousands of rounds of human editing. Experiments show that IntentFlow matches leading Gemini3-Flash models under latency constraints while identifying deeper user intent.
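To make the closed loop described above more concrete, the following is a minimal sketch of how demand detection, a three-scope memory, and proactive action could feed into one another. It is an illustration only, not the authors' implementation: the names `HybridMemory`, `detect_demand`, and `run_loop` are hypothetical, the keyword rule merely stands in for the IntentFlow model, and the list-based storage is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class HybridMemory:
    """Three memory scopes named in the abstract; the storage layout is assumed."""
    workspace: list = field(default_factory=list)  # current session context
    user: list = field(default_factory=list)       # long-term notes about one user
    global_: list = field(default_factory=list)    # knowledge shared across users

    def recall(self, keyword):
        """Return entries mentioning keyword, most specific scope first."""
        scopes = (self.workspace, self.user, self.global_)
        return [e for scope in scopes for e in scope if keyword in e.lower()]


def detect_demand(chunk):
    """Hypothetical stand-in for demand detection (DD): a keyword rule, not IntentFlow."""
    cues = {"deadline": "schedule", "forgot": "reminder"}
    for cue, need in cues.items():
        if cue in chunk.lower():
            return need
    return None  # stay silent when no latent need is inferred


def run_loop(stream, memory):
    """Closed loop: detect a need, ground it in memory, act, write feedback back."""
    actions = []
    for chunk in stream:
        memory.workspace.append(chunk)              # MM: update working context
        need = detect_demand(chunk)                 # DD: infer a latent need
        if need is None:
            continue
        evidence = memory.recall(need)              # MM: ground the action in memory
        actions.append((need, evidence))            # PAS: emit a proactive action
        memory.user.append(f"offered {need} help")  # feedback closes the loop
    return actions
```

Feeding `run_loop` a stream of utterances and a `HybridMemory` with a few user entries yields a list of `(need, evidence)` pairs. A real system would replace the keyword rule with a streaming model and enforce a latency budget on each iteration.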