Adaptive Visual Navigation Assistant in 3D RPGs

📅 2025-08-25
🤖 AI Summary
This work addresses the novel task of spatial transition point (STP) detection and main spatial transition point (MSTP) identification from single-frame 3D game images. The authors propose the first end-to-end two-stage deep learning framework for this task. Stage I employs a Faster R-CNN-based detector to localize traversable STPs. Stage II introduces a lightweight ranking network that fuses local and global visual features, augmented with parameter-efficient adapters and an optional retrieval-augmented fusion step, to identify the unique MSTP leading to the player's current macro-goal. Evaluated on a dataset collected from five action RPG titles, adapter-only transfer proves more robust than full-network fine-tuning in low-data settings and for MSTP selection, while full fine-tuning gives better STP detection when enough data is available. The work establishes the first benchmark for this task and points toward intelligent map construction and navigation assistance in 3D game environments.

๐Ÿ“ Abstract
In complex 3D game environments, players rely on visual affordances to spot map transition points. Efficient identification of such points is important for client-side auto-mapping and provides an objective basis for evaluating map cue presentation. In this work, we formalize two tasks to be solved from a single game frame and propose them as a new research focus: detecting traversable Spatial Transition Points (STPs), which are connectors between two sub-regions, and selecting the single Main STP (MSTP), the unique STP that lies on the designer-intended critical path toward the player's current macro-objective. We introduce a two-stage deep-learning pipeline that first detects potential STPs using Faster R-CNN and then ranks them with a lightweight MSTP selector that fuses local and global visual features. Both stages benefit from parameter-efficient adapters, and we further introduce an optional retrieval-augmented fusion step. Our primary goal is to establish the feasibility of this problem and set baseline performance metrics. We validate our approach on a custom-built, diverse dataset collected from five Action RPG titles. Our experiments reveal a key trade-off: while full-network fine-tuning produces superior STP detection with sufficient data, adapter-only transfer is significantly more robust and effective in low-data scenarios and for the MSTP selection task. By defining this novel problem, providing a baseline pipeline and dataset, and offering initial insights into efficient model adaptation, we aim to contribute to future AI-driven navigation aids and data-informed level-design tools.
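
The detect-then-rank flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Candidate` container, the confidence threshold, fusion by concatenation, and the single linear scorer standing in for the ranking network are all hypothetical, and Stage I (the Faster R-CNN detector) is represented only by its outputs.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Candidate:
    """One detected STP candidate (hypothetical container, not from the paper)."""
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
    det_score: float                # Stage I detector confidence
    local_feat: List[float]         # features pooled from the candidate region


def filter_detections(cands: Sequence[Candidate], min_score: float = 0.5) -> List[Candidate]:
    """Stage I post-processing: keep detections at or above a confidence threshold."""
    return [c for c in cands if c.det_score >= min_score]


def fuse(local_feat: Sequence[float], global_feat: Sequence[float]) -> List[float]:
    """Fuse local and global features by concatenation (an assumed fusion rule)."""
    return list(local_feat) + list(global_feat)


def rank_score(fused: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Stage II scorer: one linear layer as a stand-in for the ranking network."""
    return sum(w * x for w, x in zip(weights, fused)) + bias


def select_mstp(cands: Sequence[Candidate], global_feat: Sequence[float],
                weights: Sequence[float], min_score: float = 0.5) -> Optional[Candidate]:
    """Return the single highest-ranked candidate, or None if nothing survives Stage I."""
    kept = filter_detections(cands, min_score)
    if not kept:
        return None
    return max(kept, key=lambda c: rank_score(fuse(c.local_feat, global_feat), weights))


# Toy inputs: three detections in one frame, one frame-level feature.
cands = [
    Candidate((10, 20, 60, 120), 0.9, [0.2, 0.1]),
    Candidate((200, 40, 260, 130), 0.8, [0.7, 0.4]),
    Candidate((300, 50, 320, 90), 0.3, [0.9, 0.9]),  # below threshold, dropped
]
global_feat = [0.5]
weights = [1.0, 1.0, 0.5]  # made-up weights; a real selector would learn these
best = select_mstp(cands, global_feat, weights)
```

Selecting with `max` over ranking scores guarantees exactly one MSTP per frame whenever at least one STP is detected, which matches the task definition of a unique main transition point.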
Problem

Research questions and friction points this paper is trying to address.

Detecting traversable spatial transition points in 3D RPG game environments
Identifying the main transition point on the critical path from a single frame
Establishing feasibility and baseline metrics for AI navigation assistance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage deep-learning pipeline: Faster R-CNN STP detection followed by MSTP ranking
Lightweight MSTP selector that fuses local and global visual features, with parameter-efficient adapters in both stages
Optional retrieval-augmented fusion step
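
The summary does not say which adapter design the authors use. As a point of reference only, one widely used parameter-efficient form is a residual bottleneck adapter: project the feature down, apply a nonlinearity, project back up, and add the result to the input, training only the two small projections while the backbone stays frozen. The sketch below assumes that form; the function names and toy dimensions are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def relu(v: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def adapter(h: Sequence[float], w_down: Matrix, w_up: Matrix) -> List[float]:
    """Residual bottleneck adapter: h + W_up * relu(W_down * h)."""
    z = relu(matvec(w_down, h))   # down-projection to the bottleneck
    delta = matvec(w_up, z)       # up-projection back to the feature size
    return [a + b for a, b in zip(h, delta)]


h = [1.0, -2.0, 0.5, 3.0]                      # a frozen-backbone feature (toy values)
w_down = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]                # 4 -> 2 bottleneck
w_up_zero = [[0.0, 0.0] for _ in range(4)]     # zero-initialised up-projection
w_up = [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0], [0.0, 0.0]]

unchanged = adapter(h, w_down, w_up_zero)      # adapter acts as the identity
adapted = adapter(h, w_down, w_up)
```

With a zero-initialised up-projection the adapter leaves the pretrained feature untouched, so training starts from the backbone's original behaviour and only drifts as far as the data supports, which is one plausible reason adapter-only transfer holds up in low-data settings.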
Kaijie Xu
Xidian University
Clark Verbrugge
Department of Computer Science, McGill University