CERNet: Class-Embedding Predictive-Coding RNN for Unified Robot Motion, Recognition, and Confidence Estimation

📅 2025-12-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of handling real-time motion generation, intent recognition, and confidence estimation in a single model for human–robot collaboration. It proposes CERNet, an architecture that adds a class embedding vector to a hierarchical Predictive-Coding Recurrent Neural Network (PC-RNN). Three contributions stand out: (1) it unifies generative and discriminative inference in one network, using the class embedding to constrain the hidden-state dynamics during generation and optimizing it online during inference; (2) its internal prediction errors reflect the model's confidence in its recognition, so no auxiliary confidence module is needed; and (3) it reproduces motions robustly under external perturbations. Evaluated on a humanoid robot with 26 kinesthetically taught letters, the hierarchical model reduces trajectory reproduction error by 76% relative to a parameter-matched single-layer baseline and recognizes the demonstrated class online with 68% Top-1 and 81% Top-2 accuracy.

📝 Abstract

Robots interacting with humans must not only generate learned movements in real time, but also infer the intent behind observed behaviors and estimate the confidence of their own inferences. This paper proposes CERNet, a unified model that achieves all three capabilities within a single hierarchical predictive-coding recurrent neural network (PC-RNN). CERNet uses a dynamically updated class embedding vector to unify motor generation and recognition. The model operates in two modes: generation and inference. In the generation mode, the class embedding constrains the hidden state dynamics to a class-specific subspace; in the inference mode, it is optimized online to minimize prediction error, enabling real-time recognition. Validated on a humanoid robot across 26 kinesthetically taught letters of the alphabet, our hierarchical model achieves 76% lower trajectory reproduction error than a parameter-matched single-layer baseline, maintains motion fidelity under external perturbations, and infers the demonstrated trajectory class online with 68% Top-1 and 81% Top-2 accuracy. Furthermore, internal prediction errors naturally reflect the model's confidence in its recognition. This integration of robust generation, real-time recognition, and intrinsic uncertainty estimation within a single PC-RNN framework offers a compact and extensible approach to motor memory in physical robots, with potential applications in intent-sensitive human-robot collaboration.
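
The two modes described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single-layer RNN (not the hierarchical PC-RNN), uses random fixed weights in place of a trained network, and replaces the paper's online update with a simple finite-difference gradient descent. All names and sizes (`W`, `U`, `V`, `generate`, `infer_embedding`, `H`, `C`, `D`, `T`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
H, C, D, T = 16, 4, 2, 30  # hidden, embedding, output sizes and steps (hypothetical)

# Random fixed weights stand in for a trained network (assumption).
W = rng.normal(scale=0.3, size=(H, H))  # recurrent weights
U = rng.normal(scale=0.5, size=(H, C))  # class-embedding input weights
V = rng.normal(scale=0.5, size=(D, H))  # readout weights


def generate(c, steps=T):
    """Generation mode: roll the RNN forward with a fixed class embedding c."""
    h = np.zeros(H)
    out = []
    for _ in range(steps):
        h = np.tanh(W @ h + U @ c)
        out.append(V @ h)
    return np.array(out)


def prediction_error(c, observed):
    """Mean squared error between the rollout for c and an observed trajectory."""
    return float(np.mean((generate(c, len(observed)) - observed) ** 2))


def infer_embedding(observed, iters=50, eps=1e-5):
    """Inference mode: weights stay fixed; only c is adjusted to reduce the error."""
    c = np.zeros(C)
    err = prediction_error(c, observed)
    for _ in range(iters):
        # Finite-difference estimate of the error gradient with respect to c.
        grad = np.array([
            (prediction_error(c + eps * np.eye(C)[i], observed) - err) / eps
            for i in range(C)
        ])
        step = 1.0
        # Backtracking: halve the step until the error actually decreases.
        while step > 1e-8:
            candidate = c - step * grad
            cand_err = prediction_error(candidate, observed)
            if cand_err < err:
                c, err = candidate, cand_err
                break
            step *= 0.5
    return c, err


c_true = rng.normal(size=C)           # embedding of the "demonstrated" class
observed = generate(c_true)           # trajectory the model observes
initial_err = prediction_error(np.zeros(C), observed)
c_hat, final_err = infer_embedding(observed)
```

In this toy setting the same forward model serves both modes: `generate` produces a trajectory from a given embedding, while `infer_embedding` searches for the embedding that best explains an observed trajectory. The remaining error `final_err` is the kind of internal signal the abstract links to confidence.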

Problem

Research questions and friction points this paper is trying to address.

Unify robot motion generation and real-time intent recognition

Estimate confidence in recognition using internal prediction errors

Maintain motion fidelity under external perturbations during operation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical predictive-coding RNN with class embedding

Dual-mode operation for generation and real-time inference

Internal prediction errors as intrinsic confidence estimation
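
One way to picture prediction errors acting as a confidence signal is to turn per-class errors into normalized scores and rank them. The sketch below is only an illustration under stated assumptions: the softmax-over-negative-error mapping, the `temperature` parameter, the function names, and the error values are all hypothetical and are not taken from the paper.

```python
import math


def confidence_from_errors(errors, temperature=1.0):
    """Map per-class prediction errors to scores that sum to 1.

    Lower error gives a higher score. Subtracting the smallest error
    before exponentiating keeps the computation numerically stable.
    """
    lowest = min(errors.values())
    weights = {k: math.exp(-(e - lowest) / temperature) for k, e in errors.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def top_k(scores, k):
    """Return the k class labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up prediction errors for three candidate letter classes.
errors = {"A": 0.02, "B": 0.35, "C": 0.05}
scores = confidence_from_errors(errors, temperature=0.1)
best_two = top_k(scores, 2)
```

A Top-2 result counts as correct when the true class appears anywhere in `best_two`, which is why Top-2 accuracy is never lower than Top-1 accuracy.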

🔎 Similar Papers

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey