🤖 AI Summary
This study addresses long-term adaptation for autonomous robots in open, changing environments, where learning methods that fix their input features, output categories, network structures, task goals, or action sequences in advance cannot keep up with continuous change. The authors propose a bidirectional thinking-learning interaction model: thinking guides learning by identifying potential changes, selecting useful evidence, organizing training materials, and planning verification actions, while learning in turn improves thinking by updating task knowledge, feature-selection experience, action strategies, and future reasoning. The model supports adaptive input feature discovery, output category expansion, learning model update, and action routine reconstruction. In the reported experiments, final recognition accuracy in feature adaptation rises from 0.419 to 0.845, new-category formation accuracy and model-update success rate both improve, average action length in action routine reconstruction falls from 13.0 to 4.0, and the useful evidence selection rate rises from 0.272 to 0.965. Together these results suggest that the robot can move beyond predefined learning settings through continuous interaction with its environment.
📝 Abstract
Autonomous robots operating in open and changing environments cannot always rely on predefined inputs, outputs, and action routines. Although existing learning methods enable robots to improve their performance through environmental interaction, the objects of learning are often fixed in advance, such as input features, recognition outputs, network structures, task goals, or action sequences. This limits their ability to adapt when new features, new categories, or more efficient task routines appear during long-term operation. To address this problem, this paper proposes a thinking-learning interaction model for autonomous robots. The core idea is that thinking guides learning by identifying potential changes, selecting useful evidence, organizing training materials, and planning verification actions, while learning promotes thinking by updating task knowledge, feature-selection experience, action strategies, and future reasoning processes. Based on this bidirectional mechanism, the robot can gradually move beyond predefined learning settings and adapt its recognition relations and action relations through continuous interaction with the environment. Specifically, the proposed model supports adaptive input feature discovery, output category expansion, learning model update, and action routine reconstruction. Experimental results show that the proposed model improves the final recognition accuracy from 0.419 to 0.845 in feature adaptation, achieves higher new-category formation accuracy and model-update success rate, and reduces the average action length from 13.0 to 4.0 in action routine reconstruction. In learning-enhanced thinking, the useful evidence selection rate increases from 0.272 to 0.965, indicating that learning results can effectively improve future evidence selection and reasoning.
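To make the bidirectional loop concrete, the following is a minimal sketch of one thinking-learning interaction step for adaptive input feature discovery. It is not the authors' implementation: the names `Thinker`, `Learner`, and `interaction_step`, the nearest-centroid classifier, the 0.8 accuracy threshold, and the toy cup/bowl data are all assumptions chosen for illustration. Thinking detects a possible change, selects evidence, and proposes a feature; learning retrains and verifies; the verification result is written back so that later feature proposals can use it.

```python
from dataclasses import dataclass, field


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


@dataclass
class Learner:
    """Hypothetical learner: nearest-centroid classifier over chosen features."""
    features: list
    centroids: dict = field(default_factory=dict)

    def fit(self, samples):
        rows_by_label = {}
        for x, y in samples:
            rows_by_label.setdefault(y, []).append([x[f] for f in self.features])
        self.centroids = {
            y: [_mean(col) for col in zip(*rows)] for y, rows in rows_by_label.items()
        }

    def predict(self, x):
        v = [x[f] for f in self.features]
        return min(
            self.centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(v, self.centroids[y])),
        )

    def accuracy(self, samples):
        return _mean(1.0 if self.predict(x) == y else 0.0 for x, y in samples)


@dataclass
class Thinker:
    """Hypothetical thinking module: detects change, picks evidence and features."""
    candidate_features: list
    threshold: float = 0.8  # assumed accuracy level below which change is suspected
    usefulness: dict = field(default_factory=dict)  # written back by learning

    def change_detected(self, acc):
        return acc < self.threshold

    def select_evidence(self, learner, samples):
        # Treat currently misclassified samples as the useful evidence.
        return [(x, y) for x, y in samples if learner.predict(x) != y]

    def propose_feature(self, learner):
        unused = [f for f in self.candidate_features if f not in learner.features]
        # Prefer features that earlier learning results scored as useful.
        return max(unused, key=lambda f: self.usefulness.get(f, 0.0), default=None)


def interaction_step(thinker, learner, samples):
    before = learner.accuracy(samples)
    if not thinker.change_detected(before):
        return before
    evidence = thinker.select_evidence(learner, samples)  # thinking -> learning
    feature = thinker.propose_feature(learner)
    if feature is None or not evidence:
        return before
    trial = Learner(learner.features + [feature])
    trial.fit(samples)
    after = trial.accuracy(samples)  # verification on all samples
    # learning -> thinking: record how much of the evidence the new feature fixed
    thinker.usefulness[feature] = trial.accuracy(evidence)
    if after > before:
        learner.features, learner.centroids = trial.features, trial.centroids
        return after
    return before


# Toy data (assumed): "size" cannot separate the classes, "color" can.
samples = [
    ({"size": 1.0, "color": 0.0}, "cup"),
    ({"size": 1.0, "color": 1.0}, "bowl"),
    ({"size": 1.2, "color": 0.1}, "cup"),
    ({"size": 1.2, "color": 0.9}, "bowl"),
]
learner = Learner(["size"])
learner.fit(samples)
thinker = Thinker(candidate_features=["color"])
initial_accuracy = learner.accuracy(samples)
final_accuracy = interaction_step(thinker, learner, samples)
```

In this toy run the learner starts with a feature that cannot distinguish the two classes, so the thinking side flags a change and proposes the unused feature. The sketch covers only input feature discovery; output category expansion and action routine reconstruction would need analogous propose-and-verify steps.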