Closing the Loop in Teleoperation: Episode-Level Data Quality Assessment and Feedback for High-Quality Demonstration Collection

📅 2026-05-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Novice teleoperators often produce demonstrations that are task-successful but of poor quality, with inefficient motions, frequent corrections, or operation close to joint limits, which makes them poorly suited to downstream learning. To address this, the work proposes the Data Quality Assessment and Feedback (DQAF) framework, which combines semantic task progress and robot telemetry, including sub-task progress, motion smoothness, stalling behavior, and joint-limit proximity, to generate a structured quality assessment and interpretable natural-language feedback after each episode. Moving beyond binary success/failure judgments, DQAF provides fine-grained quality evaluation and closed-loop guidance tailored to teleoperation. In a diagnostic validation study, the system's output is compared with a human reviewer's assessments during dataset curation. In a pilot study with three novice operators across two manipulation tasks, the operator who received the feedback improved faster than those who did not.
📝 Abstract
Industrial automation is at a pivotal moment, as Physical AI is driving a transition from rigid, hand-engineered automation systems toward more flexible and adaptive systems. This shift has created a growing demand for large-scale, real-world robot demonstration data, making teleoperation an increasingly important mechanism for data collection. However, high-quality teleoperated demonstrations remain difficult to obtain in practice, as novice operators often produce episodes that are task-successful but suboptimal for downstream use due to inefficient motion, repeated corrections, or operation near robot joint limits. We present a Data Quality Assessment and Feedback (DQAF) framework that closes the loop in teleoperation by providing immediate post-episode feedback grounded in semantic task progress and robot telemetry. The framework extracts quality-relevant signals, such as sub-task progress, motion smoothness, stalls, and proximity to kinematic limits, and converts them into structured quality assessments and actionable natural-language feedback. Unlike binary success or failure feedback, the proposed system explains why an episode is suboptimal and highlights specific behaviors to correct in the next trial. We evaluate the framework through a diagnostic validation study and a pilot user study. In the validation study, the system is compared with a human reviewer during dataset curation, producing rejection reasons and actionable feedback for improvement. In the pilot study with three novice operators across two manipulation tasks, the operator who received the system's immediate, automated post-episode feedback improved faster than those who did not, producing higher-quality demonstrations sooner.
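To make the signal-to-feedback step in the abstract concrete, here is a minimal Python sketch of how episode-level signals might be turned into a structured report with natural-language hints. It is not the authors' implementation: the names `EpisodeReport` and `assess_episode`, the velocity-reversal proxy for smoothness, the normalized limit margin, and every threshold are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class EpisodeReport:
    """Hypothetical structured assessment for one teleoperation episode."""
    progress: float          # fraction of sub-tasks completed, 0..1
    stall_fraction: float    # fraction of steps with no meaningful motion
    reversals: int           # direction flips, a crude smoothness proxy
    min_limit_margin: float  # smallest normalized distance to a joint limit
    feedback: List[str] = field(default_factory=list)


def assess_episode(
    positions: Sequence[Sequence[float]],   # positions[t][j]: joint j at step t
    limits: Sequence[Tuple[float, float]],  # (lower, upper) per joint
    subtasks_done: int,
    subtasks_total: int,
    stall_eps: float = 1e-3,          # assumed threshold, not from the paper
    max_stall_fraction: float = 0.2,  # assumed threshold, not from the paper
    max_reversals: int = 3,           # assumed threshold, not from the paper
    min_margin: float = 0.05,         # assumed threshold, not from the paper
) -> EpisodeReport:
    # Assumes at least one timestep and one joint.
    n_joints = len(limits)

    # Per-step joint displacements (finite differences).
    deltas = [
        [positions[t + 1][j] - positions[t][j] for j in range(n_joints)]
        for t in range(len(positions) - 1)
    ]

    # A step counts as a stall when every joint moved less than stall_eps.
    stalled = sum(1 for d in deltas if all(abs(x) < stall_eps for x in d))
    stall_fraction = stalled / len(deltas) if deltas else 0.0

    # Count sign changes in each joint's displacement, skipping stalled steps.
    reversals = 0
    for j in range(n_joints):
        moves = [d[j] for d in deltas if abs(d[j]) >= stall_eps]
        reversals += sum(1 for a, b in zip(moves, moves[1:]) if a * b < 0)

    # Distance to the nearest limit, normalized by each joint's range.
    min_limit_margin = min(
        min(p[j] - lo, hi - p[j]) / (hi - lo)
        for p in positions
        for j, (lo, hi) in enumerate(limits)
    )

    progress = subtasks_done / subtasks_total
    report = EpisodeReport(progress, stall_fraction, reversals, min_limit_margin)

    # Convert each flagged signal into one actionable sentence.
    if progress < 1.0:
        report.feedback.append(
            f"Only {subtasks_done} of {subtasks_total} sub-tasks were completed."
        )
    if stall_fraction > max_stall_fraction:
        report.feedback.append(
            f"The arm was idle for {stall_fraction:.0%} of the episode; "
            "plan the motion before starting."
        )
    if reversals > max_reversals:
        report.feedback.append(
            f"{reversals} direction reversals detected; avoid back-and-forth corrections."
        )
    if min_limit_margin < min_margin:
        report.feedback.append(
            "A joint came close to its limit; reposition before reaching."
        )
    return report


# Toy single-joint episode: a pause, one back-and-forth, then a reach near the limit.
demo = assess_episode(
    positions=[[0.0], [0.1], [0.2], [0.2], [0.2], [0.1], [0.2], [0.95]],
    limits=[(-1.0, 1.0)],
    subtasks_done=2,
    subtasks_total=2,
)
for line in demo.feedback:
    print(line)
```

In this toy episode the stall and joint-limit checks fire while the reversal count stays under its threshold, so the operator would receive two hints rather than a bare pass/fail label. A real system would presumably work on timestamped telemetry and a learned or scripted sub-task detector rather than hand-set thresholds.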
Problem

Research questions and friction points this paper is trying to address.

teleoperation
data quality
robot demonstration
episode-level assessment
Physical AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

teleoperation
data quality assessment
robot demonstration
feedback system
episode-level evaluation