Structure-Preserving Graph Contrastive Learning for Mathematical Information Retrieval

📅 2026-03-09

🤖 AI Summary

This work addresses a limitation of standard graph contrastive learning: generic graph augmentation strategies often disrupt the semantic structure of mathematical expression graphs, particularly for small, structurally compact formulas. To mitigate this, the authors propose Variable Substitution, a domain-specific augmentation technique for mathematical information retrieval that preserves the core algebraic structure and semantic meaning of formulas. The work is presented as the first to bring a structure-preserving variable substitution mechanism into graph contrastive learning. When applied to a classic graph contrastive retrieval model, the method is reported to significantly outperform generic augmentation strategies on mathematical formula retrieval.

📝 Abstract

This paper introduces Variable Substitution as a domain-specific graph augmentation technique for graph contrastive learning (GCL) in the context of searching for mathematical formulas. Standard GCL augmentation techniques often distort the semantic meaning of mathematical formulas, particularly for small and highly structured graphs. Variable Substitution, on the other hand, preserves the core algebraic relationships and formula structure. To demonstrate the effectiveness of our technique, we apply it to a classic GCL-based retrieval model. Experiments show that this straightforward approach significantly improves retrieval performance compared to generic augmentation strategies. We release the code on GitHub: https://github.com/lazywulf/formula_ret_aug
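
To make the idea concrete, here is a minimal sketch of a variable-substitution augmentation. It is not the authors' implementation: the operator-tree encoding (a list of node labels plus an edge list), the function name `substitute_variables`, and the single-letter variable pool are all assumptions made for illustration. The sketch renames every distinct variable consistently and leaves operators, constants, and edges untouched, so the augmented view keeps the same algebraic structure as the original.

```python
import random

# Hypothetical encoding of the formula x^2 + y*x as an operator tree.
# Node i carries labels[i]; edges point from parent to child.
labels = ["+", "^", "x", "2", "*", "y", "x"]
edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5), (4, 6)]

# Assumption: any single lowercase letter is treated as a variable.
VARIABLE_POOL = list("abcdefghijklmnopqrstuvwxyz")


def substitute_variables(labels, pool, rng):
    """Return (new_labels, mapping) with every variable consistently renamed.

    Distinct variables receive distinct replacement names, so two
    occurrences of the same variable stay equal and different variables
    stay different. Operators and constants are copied unchanged.
    """
    present = sorted({lab for lab in labels if lab in pool})
    # Sample without replacement so the renaming is injective.
    replacements = rng.sample(pool, len(present))
    mapping = dict(zip(present, replacements))
    new_labels = [mapping.get(lab, lab) for lab in labels]
    return new_labels, mapping


if __name__ == "__main__":
    rng = random.Random(0)
    view, mapping = substitute_variables(labels, VARIABLE_POOL, rng)
    # The edge list is reused as-is: only node labels change.
    print("mapping:", mapping)
    print("augmented labels:", view)
    print("edges (unchanged):", edges)
```

In a contrastive setup, the original graph and such a renamed view would typically form a positive pair. By contrast, a generic augmentation such as dropping a node from this seven-node tree could delete an operator or an operand and change what the formula means.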

Problem

Research questions and friction points this paper is trying to address.

Graph Contrastive Learning

Mathematical Information Retrieval

Graph Augmentation

Formula Structure

Semantic Preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Variable Substitution

Graph Contrastive Learning

Mathematical Information Retrieval