STAR: Mitigating Cascading Errors in Spatial Reasoning via Turn-point Alignment and Segment-level DPO

📅 2026-04-01
🤖 AI Summary
This work addresses the cascading errors that large language models make during structured spatial navigation in complex topologies. The authors propose STAR, a two-stage framework: the first stage uses supervised fine-tuning to internalize spatial semantics and prune redundant paths, and the second applies Spatial-aware Segment-level Direct Preference Optimization (SDPO) to improve self-correction in long-horizon navigation. The study also constructs RedMaze-23K, a dataset annotated with human-inspired turning points, and is presented as the first to combine turning-point alignment with segment-level DPO for spatial reasoning. In the reported experiments, STAR-32B achieves state-of-the-art accuracy among open-source models (29.27%), surpassing DeepSeek-V3 (25.00%) and reaching 82.4% of GPT-4’s performance.
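To make the idea of a turning point concrete, here is a minimal Python sketch. It assumes a path is a list of grid coordinates and treats a turning point as any cell where the movement direction changes. This is an illustrative assumption, not the annotation rule used for RedMaze-23K, and the helper names `turning_points` and `split_at_turns` are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (row, col) on a grid; an assumed representation


def turning_points(path: List[Coord]) -> List[int]:
    """Return indices of interior cells where the movement direction changes."""
    turns = []
    for i in range(1, len(path) - 1):
        prev_step = (path[i][0] - path[i - 1][0], path[i][1] - path[i - 1][1])
        next_step = (path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1])
        if prev_step != next_step:
            turns.append(i)
    return turns


def split_at_turns(path: List[Coord]) -> List[List[Coord]]:
    """Split a path into straight segments; neighbours share the turning cell."""
    cuts = [0] + turning_points(path) + [len(path) - 1]
    return [path[a:b + 1] for a, b in zip(cuts, cuts[1:])]


# Hypothetical example path: right, right, down, down, right.
example = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3)]
print(turning_points(example))       # [2, 4]
print(len(split_at_turns(example)))  # 3
```

Segments produced this way are one plausible unit for segment-level preference comparison: each straight run between two turning points can be judged on its own.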
📝 Abstract
Structured spatial navigation is a core benchmark for the spatial reasoning of Large Language Models (LLMs). Existing paradigms like Visualization-of-Thought (VoT) are prone to cascading errors in complex topologies. To solve this, we propose STAR, a two-stage framework grounded in topological anchors, and introduce the RedMaze-23K dataset with human-inspired turn-point annotations. The first stage uses supervised fine-tuning to help models internalize spatial semantics and prune redundant paths. The second adopts Spatial-aware Segment-level Direct Preference Optimization (SDPO) to refine self-correction in long-horizon navigation. Experiments show STAR achieves state-of-the-art performance among open-source models: its 32B variant outperforms DeepSeek-V3 (29.27% vs. 25.00%) and reaches 82.4% of GPT-4's performance.
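As a rough illustration of what a segment-level preference objective can look like, the sketch below applies the standard DPO loss to the token log-probabilities of a single path segment instead of a whole response. The abstract does not spell out the spatial-aware part of SDPO, so it is not modeled here. The function name `segment_dpo_loss`, its arguments, and the default `beta` are assumptions made for illustration.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def segment_dpo_loss(
    policy_chosen: Sequence[float],
    ref_chosen: Sequence[float],
    policy_rejected: Sequence[float],
    ref_rejected: Sequence[float],
    beta: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Standard DPO loss computed over one segment's token log-probs.

    Each argument holds per-token log-probabilities for the preferred
    (chosen) or dispreferred (rejected) segment, under either the policy
    being trained or the frozen reference model.
    """
    chosen_margin = sum(policy_chosen) - sum(ref_chosen)
    rejected_margin = sum(policy_rejected) - sum(ref_rejected)
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))
```

When the policy matches the reference on both segments, the loss equals log 2; it falls below that as the policy shifts probability toward the chosen segment and away from the rejected one.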
Problem

Research questions and friction points this paper is trying to address.

cascading errors
spatial reasoning
structured spatial navigation
complex topologies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Turn-point Alignment
Segment-level DPO
Spatial Reasoning
Cascading Error Mitigation
Topological Anchors
Pukun Zhao
Guangdong University of Finance and Economics
Longxiang Wang
PhD student, City University of Hong Kong
Large language model, Encrypted database
Chen Chen
Guangdong University of Finance and Economics
Peicheng Wang
Guangdong University of Finance and Economics
Fanqing Zhou
Guangdong University of Finance and Economics
Runze Li
Westlake University
Haojian Huang
The University of Hong Kong