When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective

📅 2025-08-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pretext task selection in neuro-symbolic learning (NeSy) and semi-/self-supervised learning (SSL) remains heuristic-driven, theoretically ungrounded, and difficult to evaluate *a priori*. Method: We propose the first unified evaluation framework grounded in neuro-symbolic theory, extending NeSy methods to settings with unreliable prior knowledge. We introduce a tripartite theoretical model (the *learnability*, *reliability*, and *completeness* of symbolic knowledge) and design a quantifiable, low-data-dependent method for assessing pretext task efficacy. Contribution/Results: Our framework requires only minimal labeled data to accurately predict large-scale semi- and self-supervised performance. Empirical evaluation demonstrates a strong correlation (Pearson's *r* > 0.92) between predicted and actual downstream accuracy. The framework significantly improves pretext task selection efficiency and enhances downstream model optimization, marking a critical transition from empirically guided to theory-driven NeSy and SSL design.
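Since the headline result is a Pearson correlation between predicted and realized accuracy, a minimal sketch of that validation step might look as follows. This is not the authors' code: the numbers are illustrative placeholders, and scipy's `pearsonr` is assumed as the correlation routine.

```python
# Hedged sketch (not the authors' code): correlating the framework's
# a-priori scores for several candidate pretext tasks with the downstream
# accuracy actually reached after full-scale SSL training.
# All values below are illustrative placeholders, not the paper's data.
from scipy.stats import pearsonr

predicted_score = [0.71, 0.58, 0.83, 0.44]  # framework estimates from minimal labeled data
actual_accuracy = [0.74, 0.61, 0.86, 0.49]  # accuracy after large-scale SSL training

r, p = pearsonr(predicted_score, actual_accuracy)
print(f"Pearson r = {r:.3f}, p = {p:.3g}")  # the paper reports r > 0.92 in its experiments
```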

📝 Abstract
Neuro-symbolic (NeSy) learning improves a model's target-task performance by enabling it to satisfy knowledge, while semi-/self-supervised learning (SSL) improves target-task performance by designing unsupervised pretext tasks on unlabeled data that make the model satisfy corresponding assumptions. We extend NeSy theory from reliable knowledge to the scenario of unreliable knowledge (i.e., assumptions), thereby unifying the theoretical frameworks of SSL and NeSy. Through rigorous theoretical analysis, we demonstrate that, in theory, the impact of a pretext task on target performance hinges on three factors: the learnability of the knowledge with respect to the model, its reliability with respect to the data, and its completeness with respect to the target. We further propose schemes to operationalize these theoretical metrics and thereby develop a method that predicts the effectiveness of pretext tasks in advance. This changes the status quo in practical applications, where unsupervised tasks are selected heuristically rather than on theoretical grounds, and where the rationality of a pretext task selection is difficult to evaluate before the model is tested on the target task. In experiments, we verify a high correlation between the predicted performance (estimated using minimal data) and the actual performance achieved after large-scale semi-supervised or self-supervised learning, thus confirming the validity of the theory and the effectiveness of the evaluation method.
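The abstract names three factors (learnability, reliability, completeness) that jointly determine a pretext task's effect on the target. The paper's exact operationalization is not reproduced here, so the following is only a hedged sketch of how such a tripartite score could drive task selection; the multiplicative combination rule and all profile values are assumptions made for illustration.

```python
# Hedged sketch: ranking candidate pretext tasks by a tripartite score over
# (learnability, reliability, completeness) before any large-scale training.
# The combination rule and all values are illustrative assumptions,
# not the paper's actual metric.
from dataclasses import dataclass

@dataclass
class KnowledgeProfile:
    learnability: float  # w.r.t. the model: can the model satisfy the knowledge?
    reliability: float   # w.r.t. the data: how often does the assumption hold?
    completeness: float  # w.r.t. the target: how much of the task does it cover?

def predicted_benefit(p: KnowledgeProfile) -> float:
    # Multiplicative combination (an assumption): a pretext task helps only
    # if it is learnable, its assumption is reliable, and it covers a
    # meaningful part of the target.
    return p.learnability * p.reliability * p.completeness

candidates = {
    "rotation_prediction": KnowledgeProfile(0.9, 0.6, 0.5),
    "contrastive_views":   KnowledgeProfile(0.8, 0.9, 0.7),
}
best = max(candidates, key=lambda name: predicted_benefit(candidates[name]))
print(best, predicted_benefit(candidates[best]))  # pick the highest-scoring task
```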
Problem

Research questions and friction points this paper is trying to address.

Evaluating impact of pretext tasks on target performance
Unifying SSL and neuro-symbolic learning theoretical frameworks
Predicting pretext task effectiveness before implementation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends neuro-symbolic theory to unreliable knowledge
Proposes metrics for knowledge learnability, reliability, completeness
Develops method predicting pretext task effectiveness in advance
Lin-Han Jia
LAMDA Group, Nanjing University
Machine Learning
Si-Yu Han
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Wen-Chao Hu
Nanjing University
Machine Learning, Neuro-Symbolic Learning
Jie-Jing Shao
Nanjing University
Machine Learning, Neuro-Symbolic Learning, Reinforcement Learning
Wen-Da Wei
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Artificial Intelligence, Nanjing University, Nanjing, China
Zhi Zhou
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Computer Science, Nanjing University, Nanjing, China
Lan-Zhe Guo
LAMDA Group, Nanjing University
Machine Learning
Yu-Feng Li
Professor, Nanjing University
Machine Learning