Retrieval Augmented Learning: A Retrial-based Large Language Model Self-Supervised Learning and Autonomous Knowledge Generation

📅 2025-05-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) suffer from degraded decision-making performance in domain-specific tasks due to scarce domain data, while conventional post-training approaches incur prohibitive computational costs. Method: We propose a zero-shot, reward-free retrieval-augmented learning framework featuring a closed-loop, three-stage reasoning architecture—hypothesis generation, verification, and autonomous knowledge construction. It introduces a novel retry-based self-supervised learning paradigm and reframes RAG as a verifiable intermediate knowledge organization module to explicitly mitigate hallucination. Contribution/Results: Evaluated in the LLM-PySC2 decision-making environment, our method substantially reduces hallucination rates and improves decision accuracy, with minimal computational overhead. Crucially, it demonstrates strong out-of-distribution (OOD) robustness and cross-domain transferability, enabling effective generalization without parameter updates or external reward signals.

Technology Category

Application Category

📝 Abstract
The lack of domain-specific data in the pre-training of Large Language Models (LLMs) severely limits LLM-based decision systems in specialized applications, while post-training a model in the scenarios requires significant computational resources. In this paper, we present Retrial-Augmented Learning (RAL), a reward-free self-supervised learning framework for LLMs that operates without model training. By developing Retrieval-Augmented Generation (RAG) into a module for organizing intermediate data, we realized a three-stage autonomous knowledge generation of proposing a hypothesis, validating the hypothesis, and generating the knowledge. The method is evaluated in the LLM-PySC2 environment, a representative decision-making platform that combines sufficient complexity with domain-specific knowledge requirements. Experiments demonstrate that the proposed method effectively reduces hallucination by generating and utilizing validated knowledge, and increases decision-making performance at an extremely low cost. Meanwhile, the approach exhibits potential in out-of-distribution(OOD) tasks, robustness, and transferability, making it a cost-friendly but effective solution for decision-making problems and autonomous knowledge generation.
Problem

Research questions and friction points this paper is trying to address.

Addresses lack of domain-specific data in LLM pre-training
Reduces computational cost for post-training in specialized scenarios
Minimizes hallucination by generating validated knowledge autonomously
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reward-free self-supervised learning framework
Three-stage autonomous knowledge generation
Retrieval-Augmented Generation for data organization
🔎 Similar Papers
No similar papers found.
Z
Zongyuan Li
College of Artificial Intelligence, Nankai University
P
Pengfei Li
College of Artificial Intelligence, Nankai University
R
Runnan Qi
Laboratory for Big Data and Decision, National University of Defense Technology
Y
Yanan Ni
Laboratory for Big Data and Decision, National University of Defense Technology
L
Lumin Jiang
Laboratory for Big Data and Decision, National University of Defense Technology
H
Hui Wu
College of Artificial Intelligence, Nankai University
Xuebo Zhang
Xuebo Zhang
Ph. D, Professor, Institute of Robotics, Nankai Univeristy, China
Visual servoingmobile roboticsmotion planningSLAMgame AI
K
Kuihua Huang
Laboratory for Big Data and Decision, National University of Defense Technology
X
Xian Guo
College of Artificial Intelligence, Nankai University