SparScene: Efficient Traffic Scene Representation via Sparse Graph Learning for Large-Scale Trajectory Generation

📅 2025-12-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address high redundancy, low computational efficiency, and poor scalability in multi-agent trajectory prediction for large-scale traffic scenarios, this paper proposes SparScene, a sparse connectivity mechanism that is aware of lane-graph topology. Instead of the conventional dense graph built from distance thresholds, the method uses lane-topology priors to build semantically rich, highly sparse graphs with far fewer edges. A lightweight graph encoder with a hierarchical interaction aggregation module, operating jointly over agent-map and agent-agent relations, then enables efficient representation learning. On the Waymo Open Motion Dataset, the approach generates trajectories for more than 200 agents in a scene within 5 ms; joint inference over more than 5,000 agents and 17,000 lanes takes 54 ms and uses 2.9 GB of GPU memory. The method achieves competitive prediction accuracy while markedly improving inference speed and scalability, particularly in large-scale, real-world traffic scenes.

📝 Abstract

Multi-agent trajectory generation is a core problem for autonomous driving and intelligent transportation systems. However, efficiently modeling the dynamic interactions between numerous road users and infrastructure in complex scenes remains an open problem. Existing methods typically employ distance-based or fully connected dense graph structures to capture interaction information. This introduces a large number of redundant edges and requires complex, heavily parameterized networks for encoding, which lowers training and inference efficiency and limits scalability to large and complex traffic scenes. To overcome these limitations, we propose SparScene, a sparse graph learning framework designed for efficient and scalable traffic scene representation. Instead of relying on distance thresholds, SparScene leverages the lane graph topology to construct structure-aware sparse connections between agents and lanes, enabling an efficient yet informative scene graph representation. SparScene adopts a lightweight graph encoder that efficiently aggregates agent-map and agent-agent interactions, yielding compact scene representations with substantially improved efficiency and scalability. On the motion prediction benchmark of the Waymo Open Motion Dataset (WOMD), SparScene achieves competitive performance with remarkable efficiency. It generates trajectories for more than 200 agents in a scene within 5 ms and scales to more than 5,000 agents and 17,000 lanes with only 54 ms of inference time and 2.9 GB of GPU memory, highlighting its scalability for large-scale traffic scenes.
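The contrast between distance-threshold graphs and lane-topology graphs can be illustrated with a small sketch. The abstract does not spell out SparScene's exact connection rule, so everything below is an assumption made for illustration: the function names, the `max_hops` parameter, and the rule "connect agent i to agent j only if j's lane is reachable from i's lane within a few successor hops" are all hypothetical.

```python
from collections import deque
from itertools import combinations
import math


def distance_edges(positions, radius):
    """Dense baseline: connect every agent pair closer than `radius`."""
    edges = set()
    for i, j in combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) <= radius:
            edges.add((i, j))
    return edges


def lanes_within_hops(lane_graph, start, max_hops):
    """Breadth-first search over lane successors, at most `max_hops` steps."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        lane = queue.popleft()
        if hops[lane] == max_hops:
            continue
        for nxt in lane_graph.get(lane, ()):
            if nxt not in hops:
                hops[nxt] = hops[lane] + 1
                queue.append(nxt)
    return set(hops)


def topology_edges(agent_lane, lane_graph, max_hops):
    """Hypothetical sparse rule: edge i -> j only if j's lane is reachable from i's lane."""
    edges = set()
    for i, lane_i in enumerate(agent_lane):
        reachable = lanes_within_hops(lane_graph, lane_i, max_hops)
        for j, lane_j in enumerate(agent_lane):
            if i != j and lane_j in reachable:
                edges.add((i, j))
    return edges


# Toy scene (made-up data): two adjacent lanes with opposite driving directions.
lane_graph = {"A0": ["A1"], "A1": ["A2"], "B0": ["B1"], "B1": ["B2"]}
positions = [(0, 0), (10, 0), (5, 4), (12, 4)]
agent_lane = ["A0", "A1", "B1", "B0"]

dense = distance_edges(positions, radius=15.0)
sparse = topology_edges(agent_lane, lane_graph, max_hops=1)
print(len(dense), len(sparse))
```

In this toy scene all four agents lie within the radius of one another, so the distance rule links every pair, while the topology rule keeps only the two edges that follow lane connectivity. Agents in the opposing lane are close in space but are not connected, which is the kind of redundancy the abstract attributes to distance-based graphs.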

Problem

Research questions and friction points this paper is trying to address.

Efficiently modeling dynamic interactions in complex traffic scenes.

Reducing redundant edges and heavy parameterization in existing methods.

Enabling scalable trajectory generation for large-scale traffic scenarios.
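A rough count shows why dense connectivity becomes a bottleneck at the scale quoted in the abstract. A fully connected graph over N agents has N(N-1)/2 agent pairs, whereas a graph in which each agent keeps at most k neighbors has at most Nk edges. The value k = 16 below is a hypothetical figure chosen only for illustration and is not taken from the paper.

```latex
\[
\frac{N(N-1)}{2}\bigg|_{N=5000} = \frac{5000 \cdot 4999}{2} = 12{,}497{,}500
\qquad\text{vs.}\qquad
Nk\,\big|_{N=5000,\;k=16} = 80{,}000 .
\]
```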

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse graph learning for efficient traffic scene representation.

Structure-aware sparse connections between agents and lanes, built from the lane graph topology.

Lightweight graph encoder that aggregates agent-map and agent-agent interactions with improved scalability.
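The two-stage aggregation over agent-map and then agent-agent relations can be sketched as follows. This is a minimal illustration and not the paper's encoder: it uses plain mean aggregation with residual addition and no learned weights, and the names `scatter_mean` and `encode_scene` are hypothetical. The point it shows is that the work is proportional to the number of edges, so a sparser graph directly reduces cost.

```python
def scatter_mean(src_feats, edges, num_dst):
    """Average source feature vectors into each destination node.

    `edges` is a list of (source_index, destination_index) pairs, so the
    loop runs once per edge.
    """
    dim = len(src_feats[0])
    sums = [[0.0] * dim for _ in range(num_dst)]
    counts = [0] * num_dst
    for src, dst in edges:
        counts[dst] += 1
        for k in range(dim):
            sums[dst][k] += src_feats[src][k]
    # Nodes with no incoming edge keep a zero vector.
    return [[v / max(c, 1) for v in row] for row, c in zip(sums, counts)]


def add(a, b):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode_scene(agent_feats, lane_feats, lane_to_agent, agent_to_agent):
    """Assumed ordering: agent-map context first, then agent-agent context."""
    map_ctx = scatter_mean(lane_feats, lane_to_agent, len(agent_feats))
    agents = add(agent_feats, map_ctx)
    social_ctx = scatter_mean(agents, agent_to_agent, len(agents))
    return add(agents, social_ctx)


# Toy inputs (made-up data): three lanes, two agents.
lane_feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
agent_feats = [[1.0, 1.0], [3.0, 3.0]]
lane_to_agent = [(0, 0), (1, 0), (2, 1)]  # (lane index, agent index)
agent_to_agent = [(1, 0)]                 # agent 1 influences agent 0

print(encode_scene(agent_feats, lane_feats, lane_to_agent, agent_to_agent))
# [[6.5, 6.5], [5.0, 5.0]]
```

A learned encoder would replace the mean and the residual sum with trainable transformations, but the edge-list layout and the per-edge cost would stay the same.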
