COMMET: A System for Human-Induced Conflicts in Mobile Manipulation of Everyday Tasks

📅 2025-09-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses dynamic, socially embedded conflicts that human activities create for robots in home environments. Such conflicts often admit multiple valid resolutions, and the appropriate one depends heavily on individual user preferences. The paper proposes COMMET, a personalized conflict detection and resolution system that combines multi-modal retrieval with fine-tuned model inference. Its key contributions are: (1) a hybrid detection module that starts with multi-modal retrieval and escalates low-confidence cases to a fine-tuned model; (2) GPT-4o-based summarization of user preferences from relevant cases, drawing on collected user-preferred options and settings; and (3) a user-friendly interface for data collection, together with a demonstrated workflow for real-world deployment. In preliminary studies, the detection module achieves better accuracy and latency than GPT models. The work targets adaptive, preference-aware decision-making for embodied agents that share everyday environments with people.

📝 Abstract

Continuous advancements in robotics and AI are driving the integration of robots from industry into everyday environments. However, dynamic and unpredictable human activities in daily life can conflict, directly or indirectly, with robot actions. Moreover, because such human-induced conflicts are social in nature, their solutions are not always unique and depend heavily on the user's personal preferences. To address these challenges and facilitate the development of household robots, we propose COMMET, a system for human-induced COnflicts in Mobile Manipulation of Everyday Tasks. COMMET employs a hybrid detection approach, which begins with multi-modal retrieval and escalates to fine-tuned model inference for low-confidence cases. Based on collected user-preferred options and settings, GPT-4o is used to summarize user preferences from relevant cases. In preliminary studies, our detection module shows better accuracy and lower latency than GPT models. To facilitate future research, we also design a user-friendly interface for user data collection and demonstrate an effective workflow for real-world deployments.

Problem

Research questions and friction points this paper is trying to address.

Detecting human-robot conflicts in daily tasks

Addressing non-unique solutions from social preferences

Improving accuracy and latency in conflict detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid detection: multi-modal retrieval first, fine-tuned model inference for low-confidence cases

GPT-4o summarizes user preferences

User-friendly interface for data collection
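
The hybrid detection idea (retrieve first, escalate only when confidence is low) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how confidence is measured, so the sketch assumes the cosine similarity of the nearest stored case serves as a confidence proxy. The names `Case`, `detect_conflict`, `fallback`, and the `threshold` value of 0.8 are hypothetical, and `fallback` stands in for the fine-tuned model.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence, Tuple


@dataclass
class Case:
    # Hypothetical stored case: an embedding of a past scene plus its label.
    embedding: Sequence[float]
    is_conflict: bool


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_conflict(
    query: Sequence[float],
    cases: List[Case],
    fallback: Callable[[Sequence[float]], bool],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[bool, str]:
    """Return (is_conflict, source), where source is 'retrieval' or 'model'."""
    # Step 1: retrieval. Find the most similar stored case.
    best = max(cases, key=lambda c: cosine(query, c.embedding), default=None)
    if best is not None and cosine(query, best.embedding) >= threshold:
        # High-confidence match: reuse the stored label.
        return best.is_conflict, "retrieval"
    # Step 2: low confidence. Escalate to the (slower) model stand-in.
    return fallback(query), "model"
```

In this sketch the fallback runs only when no stored case is similar enough, which is one plausible way a hybrid design could reduce average latency compared with calling a large model on every input.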
