🤖 AI Summary
In long-horizon robotic tasks, sparse rewards and continuous state-action spaces impede symbolic modeling: existing approaches either rely solely on LLM prompting, lacking empirical grounding, or learn exclusively from demonstrations, missing high-level semantic priors. This paper proposes UniPred, the first unified framework integrating LLM-guided and perception-driven predicate invention. It features a two-tier architecture: (i) an upper tier in which an LLM generates interpretable predicate hypotheses, and (ii) a lower tier that grounds these predicates via vision foundation model features and neural predicate classifiers. A closed-loop collaboration mechanism and a non-STRIPS predicate evaluation method fuse the LLM's semantic priors with experience-based learning. Evaluated on five simulated tasks and one real-robot task, UniPred achieves 2–4× higher task success rates than pure LLM-based methods and 3–4× better sample efficiency than purely data-driven approaches.
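The closed-loop collaboration between the two tiers can be sketched roughly as follows. This is a minimal illustrative mock, not the paper's implementation: all function names are assumptions, the LLM proposer is stubbed with a fixed hypothesis set, and the grounding step simply counts effect matches in demonstrations in place of training neural classifiers on vision foundation model features.

```python
# Hypothetical sketch of UniPred's two-tier predicate-invention loop.
# Names and scoring are illustrative stand-ins, not the paper's code.

def llm_propose_predicates(feedback):
    # Upper tier (stub): an LLM would propose interpretable predicate
    # hypotheses; here a fixed set is pruned by grounding feedback.
    base = {"On(a,b)", "Holding(a)", "Clear(b)"}
    return base - feedback["rejected"]

def ground_predicate(name, demos):
    # Lower tier (stub): in the paper this would train a neural classifier
    # on vision-foundation-model features; here we just score how often the
    # predicate appears in demonstrated effects.
    matches = sum(1 for d in demos if name in d["effects"])
    return matches / max(len(demos), 1)

def unipred_loop(demos, rounds=3, threshold=0.5):
    feedback = {"rejected": set()}
    scores = {}
    for _ in range(rounds):
        hypotheses = llm_propose_predicates(feedback)
        scores = {h: ground_predicate(h, demos) for h in hypotheses}
        # Closed loop: poorly grounded predicates are fed back to the
        # upper tier so the LLM can revise its hypotheses.
        feedback["rejected"] |= {h for h, s in scores.items() if s < threshold}
    return {h: s for h, s in scores.items() if s >= threshold}

demos = [
    {"effects": {"On(a,b)", "Clear(b)"}},
    {"effects": {"On(a,b)"}},
]
print(unipred_loop(demos))
```

The key design point the sketch tries to convey is the direction of information flow: semantic priors flow top-down as hypotheses, while grounding evidence flows bottom-up as feedback that refines those hypotheses over successive rounds.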
📝 Abstract
Long-horizon robotic tasks are hard due to continuous state-action spaces and sparse feedback. Symbolic world models help by decomposing tasks into discrete predicates that capture object properties and relations. Existing methods learn predicates either top-down, by prompting foundation models without data grounding, or bottom-up, from demonstrations without high-level priors. We introduce UniPred, a bilevel learning framework that unifies both. UniPred uses large language models (LLMs) to propose predicate effect distributions that supervise neural predicate learning from low-level data, while learned feedback iteratively refines the LLM hypotheses. Leveraging strong visual foundation model features, UniPred learns robust predicate classifiers in cluttered scenes. We further propose a predicate evaluation method that supports symbolic models beyond STRIPS assumptions. Across five simulated domains and one real-robot domain, UniPred achieves 2-4 times higher success rates than top-down methods and 3-4 times faster learning than bottom-up approaches, advancing scalable and flexible symbolic world modeling for robotics.