What Questions Should Robots Be Able to Answer? A Dataset of User Questions for Explainable Robotics

📅 2025-10-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current explainable-robotics research overemphasizes "why"-type questions and lacks a systematic characterization of the questions users actually want answered. Method: the authors construct the first large-scale, structured dataset of user questions for household robot scenarios (1,893 questions elicited from 100 participants across 15 video and 7 text stimuli), organized into 12 categories and 70 fine-grained subcategories. The user study, conducted on Prolific, used the stimuli to elicit natural questions, followed by expert classification, coding, and importance scoring. Contribution/Results: the resulting question typology extends beyond "why" to cover execution details, hypothetical scenarios, and capability boundaries, and reveals significant differences between novice and experienced users. The most frequent categories are task execution details (22.5%), robot capabilities (12.7%), and performance evaluation (11.3%); questions about handling difficult scenarios are rated highest in importance. The dataset provides an empirical foundation for robot log design, QA-system evaluation, and explanation-strategy development.

📝 Abstract
With the growing use of large language models and conversational interfaces in human-robot interaction, robots' ability to answer user questions is more important than ever. We therefore introduce a dataset of 1,893 user questions for household robots, collected from 100 participants and organized into 12 categories and 70 subcategories. Most work in explainable robotics focuses on why-questions. In contrast, our dataset provides a wide variety of questions, from questions about simple execution details to questions about how the robot would act in hypothetical scenarios, thus giving roboticists valuable insights into what questions their robot needs to be able to answer. To collect the dataset, we created 15 video stimuli and 7 text stimuli, depicting robots performing varied household tasks. We then asked participants on Prolific what questions they would want to ask the robot in each portrayed situation. In the final dataset, the most frequent categories are questions about task execution details (22.5%), the robot's capabilities (12.7%), and performance assessments (11.3%). Although questions about how robots would handle potentially difficult scenarios and ensure correct behavior are less frequent, users rank them as the most important for robots to be able to answer. Moreover, we find that users who identify as novices in robotics ask different questions than more experienced users. Novices are more likely to inquire about simple facts, such as what the robot did or the current state of the environment. As robots enter environments shared with humans and language becomes central to instruction and interaction, this dataset provides a valuable foundation for (i) identifying the information robots need to log and expose to conversational interfaces, (ii) benchmarking question-answering modules, and (iii) designing explanation strategies that align with user expectations.
Problem

Research questions and friction points this paper is trying to address.

Identifying diverse user questions for household robots to answer
Creating a dataset to benchmark robot question-answering capabilities
Understanding user expectations for explainable robotics systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Created dataset of 1,893 user questions for household robots
Used video and text stimuli to collect diverse question types
Provides foundation for benchmarking robot question-answering capabilities
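As a minimal sketch of how a dataset structured this way could support benchmarking, the snippet below tallies per-category question frequencies from labeled records. The record format, field layout, and example entries are hypothetical illustrations, not the paper's actual schema or data.

```python
from collections import Counter

# Hypothetical (question, category) records; labels are illustrative,
# not the dataset's real taxonomy identifiers.
questions = [
    ("What order did you clean the rooms in?", "task_execution_details"),
    ("Can you reach the top shelf?", "robot_capabilities"),
    ("How well did the vacuuming go?", "performance_assessment"),
    ("What would you do if the door were locked?", "hypothetical_scenarios"),
    ("Could you carry this laundry basket?", "robot_capabilities"),
]

def category_frequencies(records):
    """Return each category's share of the total question count."""
    counts = Counter(category for _, category in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

freqs = category_frequencies(questions)
print(freqs["robot_capabilities"])  # 2 of 5 questions -> 0.4
```

Applied to the real dataset, the same tally would recover the headline distribution reported above (e.g. task execution details at 22.5%).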