LIEREx: Language-Image Embeddings for Robotic Exploration

📅 2026-01-30
🏛️ KI - Künstliche Intelligenz
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the limitation of traditional semantic mapping, which relies on predefined object categories and struggles to handle unknown objects, thereby hindering goal-directed exploration in partially unknown environments. To overcome this, the authors propose an open-vocabulary semantic mapping approach that integrates vision-language foundation models—such as CLIP—with 3D semantic scene graphs. This method introduces open-vocabulary semantic embeddings into 3D scene graph construction for the first time, effectively bypassing the constraints of fixed taxonomies. By enabling natural language–guided exploration strategies, the framework facilitates robust recognition and semantic reasoning about out-of-distribution target objects, significantly enhancing the robot’s semantic understanding and task generalization capabilities in dynamic and unfamiliar settings.

📝 Abstract
Semantic maps allow a robot to reason about its surroundings to fulfill tasks such as navigating known environments, finding specific objects, and exploring unmapped areas. Traditional mapping approaches provide accurate geometric representations but are often constrained by pre-designed symbolic vocabularies. The reliance on fixed object classes makes it impractical to handle out-of-distribution knowledge not defined at design time. Recent advances in Vision-Language Foundation Models, such as CLIP, enable open-set mapping, where objects are encoded as high-dimensional embeddings rather than fixed labels. In LIEREx, we integrate these VLFMs with established 3D Semantic Scene Graphs to enable target-directed exploration by an autonomous agent in partially unknown environments.
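The abstract's core idea, storing objects as language-image embeddings rather than fixed labels so a free-text query can retrieve out-of-distribution objects, can be sketched as follows. This is a minimal illustration, not LIEREx's actual implementation: the `SceneObject` structure, node identifiers, and toy vectors (standing in for CLIP outputs) are all assumptions for the example.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SceneObject:
    node_id: str           # hypothetical node identifier in the 3D scene graph
    embedding: np.ndarray  # language-image embedding of the object

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, the standard match score for CLIP-style embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(query_embedding: np.ndarray, objects: list[SceneObject]) -> SceneObject:
    """Return the scene object whose embedding is closest to the query."""
    return max(objects, key=lambda o: cosine_similarity(query_embedding, o.embedding))

# Toy embeddings; in practice these come from a VLFM such as CLIP.
scene = [
    SceneObject("node_3", np.array([0.9, 0.1, 0.0])),  # e.g. a mug
    SceneObject("node_7", np.array([0.1, 0.8, 0.2])),  # e.g. a chair
]
query = np.array([0.85, 0.15, 0.05])  # embedding of a text query, e.g. "a coffee mug"
assert best_match(query, scene).node_id == "node_3"
```

Because the match is a similarity in embedding space rather than a lookup in a fixed taxonomy, the same mechanism works for object categories that were never enumerated at design time.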
Problem

Research questions and friction points this paper is trying to address.

semantic mapping
open-set recognition
robotic exploration
out-of-distribution objects
Vision-Language Foundation Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language Foundation Models
Open-set Semantic Mapping
3D Semantic Scene Graphs
Target-directed Exploration
Language-Image Embeddings
Felix Igelbrink
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany.
Lennart Niecksch
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany; Osnabrück University, Semantic Information Systems Group, Osnabrück, Germany.
Marian Renz
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany; Osnabrück University, Semantic Information Systems Group, Osnabrück, Germany.
Martin Günther
DFKI Cooperative and Autonomous Systems, Osnabrück
Artificial Intelligence · Robotics · 3D Object Recognition · Semantic Mapping · Context in Perception
Martin Atzmueller
Professor - Osnabrück University & Scientific Director - German Research Center for AI (DFKI)
complex data · explainable AI · interpretability · machine perception · semantic modeling