Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model

📅 2025-09-16

🤖 AI Summary

Problem: Robots operating in home or office environments must infer object ownership (e.g., “my cup”), yet visual features alone are insufficient for reliable identification. Method: We propose Active Ownership Learning (ActOwL), a framework that combines large language models (LLMs) with active learning. An LLM uses commonsense reasoning to pre-assess whether each object is likely to be personally owned rather than shared, which narrows the candidate set for querying. A probabilistic generative model, grounded in Bayesian inference, then selects the ownership question with the highest expected information gain, that is, the largest expected reduction in entropy. The system thus queries only objects that are likely to be owned and whose ownership remains uncertain. Contribution/Results: Evaluated in a simulated home environment and a real-world laboratory setting, the framework achieves significantly higher ownership clustering accuracy than baseline methods while asking fewer questions. It thereby supports efficient, socially appropriate robot behavior in ownership-sensitive tasks.

📝 Abstract

Robots operating in domestic and office environments must understand object ownership to correctly execute instructions such as “Bring me my cup.” However, ownership cannot be reliably inferred from visual features alone. To address this gap, we propose Active Ownership Learning (ActOwL), a framework that enables robots to actively generate ownership-related questions and ask them to users. ActOwL employs a probabilistic generative model to select questions that maximize information gain, so that ownership knowledge is acquired with as few questions as possible. Additionally, by leveraging commonsense knowledge from Large Language Models (LLMs), objects are pre-classified as either shared or owned, and only owned objects are targeted for questioning. In experiments in a simulated home environment and a real-world laboratory setting, ActOwL achieved significantly higher ownership clustering accuracy with fewer questions than baseline methods. These findings demonstrate the effectiveness of combining active inference with LLM-guided commonsense reasoning, advancing the capability of robots to acquire ownership knowledge for practical and socially appropriate task execution.
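
The question-selection idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's generative model: it assumes a simplified setting in which the robot holds an independent categorical belief over possible owners for each object, the user's answer is correct with a fixed probability, and the LLM pre-classification is replaced by a hand-written list of owned candidates. All names (`expected_information_gain`, `select_question`, `answer_accuracy`, the example objects) are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(belief, answer_accuracy=0.9):
    """Expected entropy reduction from asking "who owns this object?".

    Assumption: the answer names the true owner with probability
    `answer_accuracy`; otherwise it names one of the other people
    uniformly at random.
    """
    k = len(belief)
    if k < 2:
        return 0.0
    wrong = (1.0 - answer_accuracy) / (k - 1)
    gain = entropy(belief)
    for a in range(k):
        # Joint probability of (owner = o, answer = a) for every owner o.
        joint = [belief[o] * (answer_accuracy if o == a else wrong)
                 for o in range(k)]
        p_a = sum(joint)
        if p_a > 0:
            posterior = [j / p_a for j in joint]  # Bayes update for answer a
            gain -= p_a * entropy(posterior)
    return gain


def select_question(beliefs, owned_candidates, answer_accuracy=0.9):
    """Pick the candidate object whose question is most informative."""
    return max(
        owned_candidates,
        key=lambda name: expected_information_gain(beliefs[name],
                                                   answer_accuracy),
    )


# Current beliefs over three possible owners for each object.
beliefs = {
    "mug": [0.5, 0.3, 0.2],
    "laptop": [0.9, 0.05, 0.05],
    "tissue box": [1 / 3, 1 / 3, 1 / 3],
}
# Stand-in for the LLM step: the tissue box is treated as shared,
# so it is never asked about even though its belief is uniform.
owned_candidates = ["mug", "laptop"]

chosen = select_question(beliefs, owned_candidates)
print(f"Ask about: {chosen}")
```

In this sketch the pre-classification step matters because the shared object has the most uncertain belief and would otherwise attract a question that has no useful answer; restricting the candidates lets the information-gain criterion focus on objects that actually have an owner.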

Problem

Research questions and friction points this paper is trying to address.

Robots must actively generate ownership questions and ask them to users

Maximize information gain via probabilistic generative model

Leverage LLM commonsense to classify shared vs owned objects

Innovation

Methods, ideas, or system contributions that make the work stand out.

Active question generation using an LLM

Probabilistic model for information gain

Pre-classification of objects via commonsense
