🤖 AI Summary
Traditional stateless function calls (FC) in intelligent cockpits are inefficient, recover poorly from errors, and must probe the environment repeatedly to model it. To address these issues, this paper proposes State-based Function Call (SFC), a state-aware method that models system state explicitly and performs direct state transitions. The paper also introduces the first highly integrated vehicular multi-device simulation environment, comprising 30 modules, 250 APIs, and 680 attributes, which supports real-time state feedback and quantitative evaluation of agent behavior. Building on an executable simulation architecture, fine-grained state tracking, and a systematic testing framework, SFC significantly improves execution accuracy and reduces latency on complex multimodal tasks. The source code and evaluation platform are publicly released, establishing a standardized benchmark for automotive agent research.
📝 Abstract
Intelligent vehicle cockpits present unique challenges for API agents, requiring coordination across tightly coupled subsystems whose complexity exceeds that of typical task environments. Traditional Function Calling (FC) approaches operate statelessly: the agent must issue multiple exploratory calls to build environmental awareness before execution, which leads to inefficiency and limited error recovery. We introduce VehicleWorld, the first comprehensive environment for the automotive domain, featuring 30 modules, 250 APIs, and 680 properties with fully executable implementations that provide real-time state information during agent execution. This environment enables precise evaluation of vehicle agent behaviors across diverse, challenging scenarios. Through systematic analysis, we find that direct state prediction outperforms function calling for environmental control. Building on this insight, we propose State-based Function Call (SFC), a novel approach that maintains explicit system state awareness and implements direct state transitions to achieve target conditions. Experimental results demonstrate that SFC significantly outperforms traditional FC approaches, achieving superior execution accuracy and reduced latency. All implementation code is publicly available on GitHub at https://github.com/OpenMOSS/VehicleWorld.
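
The contrast between stateless FC and SFC can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `ToyCockpit` class, its two modules, its attribute names, and the `run_fc` and `run_sfc` helpers are not the VehicleWorld API. The call counts describe this toy only and are not results from the paper.

```python
from copy import deepcopy


class ToyCockpit:
    """Hypothetical two-module cockpit; NOT the VehicleWorld API."""

    def __init__(self):
        self.state = {
            "air_conditioner": {"power": False, "temperature": 26},
            "window": {"open_percent": 0},
        }
        self.calls = 0  # number of environment interactions so far

    # Stateless function-call interface: one attribute per call.
    def get(self, module, attr):
        self.calls += 1
        return self.state[module][attr]

    def set(self, module, attr, value):
        self.calls += 1
        self.state[module][attr] = value

    # State-based interface: the agent sees the whole state
    # and submits a target state in a single step.
    def snapshot(self):
        return deepcopy(self.state)

    def apply_transition(self, target):
        self.calls += 1
        changed = []
        for module, attrs in target.items():
            for attr, value in attrs.items():
                if self.state[module][attr] != value:
                    self.state[module][attr] = value
                    changed.append((module, attr))
        return changed


def run_fc(env):
    """Stateless style: probe each attribute before changing it."""
    if not env.get("air_conditioner", "power"):
        env.set("air_conditioner", "power", True)
    if env.get("air_conditioner", "temperature") != 22:
        env.set("air_conditioner", "temperature", 22)
    return env.calls


def run_sfc(env):
    """State-based style: edit a copy of the state, apply it once."""
    target = env.snapshot()
    target["air_conditioner"]["power"] = True
    target["air_conditioner"]["temperature"] = 22
    env.apply_transition(target)
    return env.calls


fc_env, sfc_env = ToyCockpit(), ToyCockpit()
fc_calls = run_fc(fc_env)
sfc_calls = run_sfc(sfc_env)
print(fc_calls, sfc_calls, fc_env.state == sfc_env.state)
```

In this toy, both styles reach the same final state, but the stateless version spends calls on probing while the state-based version submits one target state. The sketch assumes the full state is handed to the agent without an explicit call, which is a simplification.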