LIEREx: Language-Image Embeddings for Robotic Exploration

📅 2026-01-30
🏛️ KI - Künstliche Intelligenz
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the limitation of traditional semantic mapping, which relies on predefined object categories and struggles to handle unknown objects, thereby hindering goal-directed exploration in partially unknown environments. To overcome this, the authors propose an open-vocabulary semantic mapping approach that integrates vision-language foundation models—such as CLIP—with 3D semantic scene graphs. This method introduces open-vocabulary semantic embeddings into 3D scene graph construction for the first time, effectively bypassing the constraints of fixed taxonomies. By enabling natural language–guided exploration strategies, the framework facilitates robust recognition and semantic reasoning about out-of-distribution target objects, significantly enhancing the robot’s semantic understanding and task generalization capabilities in dynamic and unfamiliar settings.

📝 Abstract
Semantic maps allow a robot to reason about its surroundings to fulfill tasks such as navigating known environments, finding specific objects, and exploring unmapped areas. Traditional mapping approaches provide accurate geometric representations but are often constrained by pre-designed symbolic vocabularies. The reliance on fixed object classes makes it impractical to handle out-of-distribution knowledge not defined at design time. Recent advances in Vision-Language Foundation Models, such as CLIP, enable open-set mapping, where objects are encoded as high-dimensional embeddings rather than fixed labels. In LIEREx, we integrate these VLFMs with established 3D Semantic Scene Graphs to enable target-directed exploration by an autonomous agent in partially unknown environments.
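The abstract's core idea, storing objects as language-image embeddings rather than fixed labels so a free-text query can retrieve out-of-distribution objects, can be sketched as follows. This is a minimal illustration, not LIEREx's actual implementation: the `SceneObject` structure, node identifiers, and toy vectors (standing in for CLIP outputs) are all assumptions for the example.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SceneObject:
    node_id: str           # hypothetical node identifier in the 3D scene graph
    embedding: np.ndarray  # language-image embedding of the object

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, the standard match score for CLIP-style embeddings."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(query_embedding: np.ndarray, objects: list[SceneObject]) -> SceneObject:
    """Return the scene object whose embedding is closest to the query."""
    return max(objects, key=lambda o: cosine_similarity(query_embedding, o.embedding))

# Toy embeddings; in practice these come from a VLFM such as CLIP.
scene = [
    SceneObject("node_3", np.array([0.9, 0.1, 0.0])),  # e.g. a mug
    SceneObject("node_7", np.array([0.1, 0.8, 0.2])),  # e.g. a chair
]
query = np.array([0.85, 0.15, 0.05])  # embedding of a text query, e.g. "a coffee mug"
assert best_match(query, scene).node_id == "node_3"
```

Because the match is a similarity in embedding space rather than a lookup in a fixed taxonomy, the same mechanism works for object categories that were never enumerated at design time.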
Problem

Research questions and friction points this paper is trying to address.

semantic mapping
open-set recognition
robotic exploration
out-of-distribution objects
Vision-Language Foundation Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language Foundation Models
Open-set Semantic Mapping
3D Semantic Scene Graphs
Target-directed Exploration
Language-Image Embeddings
Felix Igelbrink
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany.
Lennart Niecksch
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany; Osnabrück University, Semantic Information Systems Group, Osnabrück, Germany.
Marian Renz
German Research Center for Artificial Intelligence (DFKI), Research Department Cooperative and Autonomous Systems (CAS), Osnabrück, Germany; Osnabrück University, Semantic Information Systems Group, Osnabrück, Germany.
Martin Günther
DFKI Cooperative and Autonomous Systems, Osnabrück
Artificial Intelligence · Robotics · 3D Object Recognition · Semantic Mapping · Context in Perception
Martin Atzmueller
Professor - Osnabrück University & Scientific Director - German Research Center for AI (DFKI)
complex data · explainable AI · interpretability · machine perception · semantic modeling