ViTE: Virtual Graph Trajectory Expert Router for Pedestrian Trajectory Prediction

📅 2025-11-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
GNN-based pedestrian trajectory prediction faces an inherent trade-off between receptive field and computational cost: shallow architectures suffer from limited receptive fields (under-reaching), while deep stacks incur prohibitive computational overhead. To address this, we propose ViTE, which introduces dynamic virtual nodes through a virtual graph structure. This replaces deep layer stacking: one-hop interactions are modeled explicitly, while higher-order dependencies are captured implicitly through the virtual nodes. Furthermore, we design a scene-aware routing mechanism based on Mixture of Experts (MoE) that adaptively selects interaction experts according to social context. By jointly encoding social attention and trajectory features, ViTE models varied interaction patterns efficiently. Experiments on the ETH/UCY, NBA, and SDD benchmarks demonstrate state-of-the-art performance, with gains in both prediction accuracy and inference efficiency. Ablation studies confirm the effectiveness and scalability of the approach.

📝 Abstract
Pedestrian trajectory prediction is critical for ensuring safety in autonomous driving, surveillance systems, and urban planning applications. While early approaches primarily focus on one-hop pairwise relationships, recent studies attempt to capture high-order interactions by stacking multiple Graph Neural Network (GNN) layers. However, these approaches face a fundamental trade-off: insufficient layers may lead to under-reaching problems that limit the model's receptive field, while excessive depth can result in prohibitive computational costs. We argue that an effective model should be capable of adaptively modeling both explicit one-hop interactions and implicit high-order dependencies, rather than relying solely on architectural depth. To this end, we propose ViTE (Virtual graph Trajectory Expert router), a novel framework for pedestrian trajectory prediction. ViTE consists of two key modules: a Virtual Graph that introduces dynamic virtual nodes to model long-range and high-order interactions without deep GNN stacks, and an Expert Router that adaptively selects interaction experts based on social context using a Mixture-of-Experts design. This combination enables flexible and scalable reasoning across varying interaction patterns. Experiments on three benchmarks (ETH/UCY, NBA, and SDD) demonstrate that our method consistently achieves state-of-the-art performance, validating both its effectiveness and practical efficiency.

Problem

Research questions and friction points this paper is trying to address.

Modeling high-order pedestrian interactions without deep GNN stacks
Balancing computational cost and receptive field in trajectory prediction
Adaptively capturing both explicit and implicit social dependencies
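
The under-reaching problem listed above can be illustrated with a small hop-count experiment. The sketch below is not ViTE's construction: it assumes a toy chain of eight pedestrians and a single static virtual node linked to every real node, whereas the paper describes dynamic virtual nodes. The helper names (`path_graph`, `add_virtual_node`, `layers_needed`) are hypothetical. It only shows why a virtual node can shrink the number of message-passing layers needed for any two pedestrians to exchange information.

```python
from collections import deque

def path_graph(n):
    """Adjacency sets for a chain 0-1-...-(n-1): each pedestrian sees only its neighbours."""
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    return adj

def add_virtual_node(adj):
    """Return a copy of the graph with one extra node linked to every real node."""
    new = {u: set(vs) for u, vs in adj.items()}
    v = max(adj) + 1  # hypothetical id for the virtual node
    new[v] = set(adj)
    for u in adj:
        new[u].add(v)
    return new, v

def hops(adj, src):
    """Breadth-first hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def layers_needed(adj, real_nodes):
    """Largest hop distance between any pair of real nodes, i.e. the number of
    one-hop message-passing layers needed for every pair to interact."""
    return max(hops(adj, s)[t] for s in real_nodes for t in real_nodes)

chain = path_graph(8)
augmented, vnode = add_virtual_node(chain)
real = list(chain)

print(layers_needed(chain, real))      # 7
print(layers_needed(augmented, real))  # 2
```

In this toy setting the plain chain needs seven layers before the two end pedestrians can influence each other, while the augmented graph needs two, because every pair is connected through the virtual node.
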

Innovation

Methods, ideas, or system contributions that make the work stand out.

Virtual Graph models long-range interactions without deep GNN stacks
Expert Router adaptively selects experts using Mixture-of-Experts design
Combination enables flexible reasoning across varying interaction patterns
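
The expert-routing idea above can be sketched with a generic top-k softmax gate, a common Mixture-of-Experts pattern. This is an assumption for illustration: the summary does not specify ViTE's gating function, its experts, or how many experts are selected. The gate weights, the toy experts, and the function name `route` are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(context, gate_weights, experts, k=2):
    """Score each expert from the context vector, keep the top-k experts,
    and return the softmax-weighted mix of their outputs."""
    # One logit per expert: dot product of its gate row with the context.
    logits = [sum(w * c for w, c in zip(row, context)) for row in gate_weights]
    # Indices of the k highest-scoring experts.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalise over the selected experts only.
    weights = softmax([logits[i] for i in top])
    outputs = [experts[i](context) for i in top]
    dim = len(outputs[0])
    mixed = [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(dim)]
    return mixed, top, weights

# Hypothetical toy experts standing in for learned interaction modules.
experts = [
    lambda x: list(x),              # expert 0: identity
    lambda x: [2.0 * v for v in x], # expert 1: doubles the features
    lambda x: [-v for v in x],      # expert 2: negates the features
]

# Hand-picked gate weights (one row per expert); a real router would learn these.
gate_weights = [[2.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]

context = [1.0, 0.0]  # stand-in for an encoded social context
mixed, top, weights = route(context, gate_weights, experts, k=2)
print(top)      # [0, 1]
print(weights)  # two weights summing to 1, larger for expert 0
print(mixed)
```

Only the selected experts are evaluated, so compute grows with `k` rather than with the total number of experts, which is the usual motivation for sparse routing.
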