🤖 AI Summary
This work addresses the challenge of efficiently exploring an environment under a limited action budget to construct high-quality semantic scene graphs (SSGs). The authors propose a reinforcement learning–based navigation strategy that combines a fine-grained discrete action space with a factorized multi-head policy network to improve both exploration efficiency and decision quality. Replacing the policy-optimization algorithm alone, with the reward function left unchanged, yields a 21% relative improvement in SSG completeness. Combining the modern optimizer with the factorized action representation gives the best completeness–efficiency trade-off, and a systematic evaluation shows that curriculum learning and depth-aware collision supervision improve training stability and execution safety.
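The summary does not name the replacement optimization algorithm. As a hedged illustration only, a common modern choice for discrete-action policy optimization is a PPO-style clipped surrogate objective; the sketch below (function and parameter names are ours, not from the paper) shows the per-sample form of that objective.

```python
def ppo_clip_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate objective (to be maximized).

    ratio     : pi_new(a|s) / pi_old(a|s), the importance ratio
    advantage : estimated advantage of the sampled action
    eps       : clip range; 0.2 is the commonly used default

    Clipping the ratio to [1 - eps, 1 + eps] and taking the minimum
    removes the incentive to move the policy far from the old one in
    a single update, which stabilizes training.
    """
    clipped_ratio = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped_ratio * advantage)
```

For positive advantages the objective is capped once the ratio exceeds `1 + eps`; for negative advantages the penalty is not reduced below the `1 - eps` level, so the update is conservative in both directions.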
📝 Abstract
Semantic world models enable embodied agents to reason about objects, relations, and spatial context beyond purely geometric representations. In Organic Computing, such models are a key enabler for objective-driven self-adaptation under uncertainty and resource constraints. The core challenge is to acquire observations maximising model quality and downstream usefulness within a limited action budget.
Semantic scene graphs (SSGs) provide a structured and compact representation for this purpose. However, constructing them within a finite action horizon requires exploration strategies that trade off information gain against navigation cost and decide when additional actions yield diminishing returns.
This work presents a modular navigation component for Embodied Semantic Scene Graph Generation and modernises its decision-making by replacing the policy-optimisation method and revisiting the discrete action formulation. We study both a compact and a finer-grained, larger discrete motion set, and compare a single-head policy over atomic actions with a factorised multi-head policy over action components. We further evaluate curriculum learning and optional depth-based collision supervision, assessing SSG completeness, execution safety, and navigation behaviour.
Results show that replacing the optimisation algorithm alone improves SSG completeness by 21% relative to the baseline under identical reward shaping. Depth supervision mainly affects execution safety (collision-free motion), while completeness remains largely unchanged. Combining modern optimisation with a finer-grained, factorised action representation yields the strongest overall completeness–efficiency trade-off.
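The factorised multi-head policy can be sketched as follows. The component sets and their sizes below are illustrative assumptions, not taken from the paper; the point is that each action component gets its own categorical head, and the joint log-probability factorises as a sum, so a fine-grained motion set needs far fewer output logits than a single head over all atomic combinations.

```python
import math
import random

# Hypothetical discretisation of two action components (not from the paper):
TURN_ANGLES = [-45, -30, -15, 0, 15, 30, 45]   # degrees
STEP_LENGTHS = [0.0, 0.1, 0.25, 0.5, 1.0]      # metres

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample_factorised(turn_logits, move_logits, rng=random):
    """Sample one action per component head.

    Each head is an independent categorical distribution, so the joint
    log-probability of the composite action is the sum of the
    per-component log-probabilities.
    """
    p_turn = softmax(turn_logits)
    p_move = softmax(move_logits)
    turn = rng.choices(range(len(p_turn)), weights=p_turn)[0]
    move = rng.choices(range(len(p_move)), weights=p_move)[0]
    logp = math.log(p_turn[turn]) + math.log(p_move[move])
    return (TURN_ANGLES[turn], STEP_LENGTHS[move]), logp

# A single-head policy over atomic actions would need
# len(TURN_ANGLES) * len(STEP_LENGTHS) = 35 logits, one per
# (turn, step) pair; the factorised form needs only 7 + 5 = 12.
action, logp = sample_factorised([0.0] * 7, [0.0] * 5)
```

With uniform logits every composite action has probability 1/(7·5), which is exactly what the summed per-head log-probabilities recover; the saving in output size grows multiplicatively as more components (e.g. camera tilt) are added.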