SEER-VAR: Semantic Egocentric Environment Reasoner for Vehicle Augmented Reality

šŸ“… 2025-08-24
šŸ“ˆ Citations: 0
✨ Influential: 0
šŸ¤– AI Summary
Current in-vehicle AR systems struggle to separate dynamic cabin and road scenes, lack environment-adaptive spatial alignment and perception-consistent rendering, and offer neither LLM-driven context-aware recommendation nor a real-world SLAM evaluation benchmark for driving. This paper proposes the first framework for semantic, dynamic cabin–road separation, using depth-guided vision–language grounding for cross-modal alignment. A dual-branch, context-aware SLAM architecture enables robust 6DoF tracking in each scene context. The authors introduce EgoSLAM-Drive, the first real-world, first-person in-vehicle AR dataset, together with the first GPT-driven AR content recommendation module for driving scenarios. Experiments demonstrate significant improvements in spatial alignment accuracy, AR rendering consistency, user scene comprehension, prompt relevance, and driving comfort across diverse driving conditions.

šŸ“ Abstract
We present SEER-VAR, a novel framework for egocentric vehicle-based augmented reality (AR) that unifies semantic decomposition, Context-Aware SLAM Branches (CASB), and LLM-driven recommendation. Unlike existing systems that assume static or single-view settings, SEER-VAR dynamically separates cabin and road scenes via depth-guided vision-language grounding. Two SLAM branches track egocentric motion in each context, while a GPT-based module generates context-aware overlays such as dashboard cues and hazard alerts. To support evaluation, we introduce EgoSLAM-Drive, a real-world dataset featuring synchronized egocentric views, 6DoF ground-truth poses, and AR annotations across diverse driving scenarios. Experiments demonstrate that SEER-VAR achieves robust spatial alignment and perceptually coherent AR rendering across varied environments. As one of the first to explore LLM-based AR recommendation in egocentric driving, we address the lack of comparable systems through structured prompting and detailed user studies. Results show that SEER-VAR enhances perceived scene understanding, overlay relevance, and driver ease, providing an effective foundation for future research in this direction. Code and dataset will be made open source.
Problem

Research questions and friction points this paper is trying to address.

Dynamic separation of cabin and road scenes
Robust spatial alignment for AR rendering
LLM-driven context-aware overlay generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Depth-guided semantic decomposition for scene separation
Dual SLAM branches for contextual motion tracking
GPT-based module for context-aware AR overlays
Yuzhi Lai
Eberhard-Karls-Universität Tübingen
Shenghai Yuan
Nanyang Technological University
Peizheng Li
Eberhard-Karls-Universität Tübingen, Mercedes-Benz AG
Jun Lou
Mercedes-Benz AG
Andreas Zell
Professor of Computer Science, Universität Tübingen
Robotics, Bioinformatics, Machine Learning, Artificial Intelligence, Image Processing