Learning Noise-Resilient and Transferable Graph-Text Alignment via Dynamic Quality Assessment

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing CLIP-style graph–text alignment methods suffer from two key limitations: (1) they enforce rigid one-to-one mapping assumptions, ignoring intrinsic many-to-many semantic relationships in graphs; and (2) they rely on static alignment objectives, yielding poor robustness under noisy supervision. This paper proposes ADAligner—the first dynamic supervision-aware graph–text alignment framework. ADAligner adaptively selects either subgraph-level many-to-many alignment (for high-quality data) or node-level one-to-one alignment (under noise) via batch-level alignment reliability estimation and dynamic filtering of low-confidence samples. It introduces a soft subgraph alignment loss and a tunable optimization objective, with theoretical guarantees establishing it as a stable negative-feedback system. Evaluated on nine text-attributed graph datasets, ADAligner achieves significant gains in zero-shot/few-shot classification, link prediction, and cross-modal retrieval, accelerates training by 2–3×, and demonstrates exceptional robustness to label noise.

📝 Abstract
Pre-training Graph Foundation Models (GFMs) on text-attributed graphs (TAGs) is central to web-scale applications such as search, recommendation, and knowledge discovery. However, existing CLIP-style graph-text aligners face two key limitations: they assume strict one-to-one correspondences between nodes and texts, overlooking the inherent many-to-many relations in real-world graphs; and they rely on static alignment objectives that cannot adapt to varying data quality, making them brittle under noisy supervision. Together, these limitations expose a core dilemma: embracing expressive many-to-many alignment amplifies noise, while reverting to strict one-to-one strategies sacrifices semantic diversity and fails to handle inherently mismatched pairs. To address these challenges, we propose ADAligner, a dynamic, quality-aware graph-text alignment framework that dynamically adjusts between expressive many-to-many and conservative one-to-one objectives according to supervision quality. ADAligner estimates batch-level alignment reliability in real time and adapts its optimization accordingly, promoting soft, subgraph-level many-to-many alignment when supervision is clean, while emphasizing reliable one-to-one alignment by dynamically filtering low-confidence pairs under noise. Theoretically, we prove that this dynamic mechanism forms a stable negative feedback process, ensuring convergence and robustness. Comprehensive experiments on nine diverse TAG datasets demonstrate that ADAligner consistently outperforms prior graph-text aligners on zero-/few-shot node classification, link prediction and cross-modal retrieval tasks. It maintains strong robustness under noisy supervision and accelerates pre-training by approximately 2 to 3 times compared to multimodal baselines, establishing a scalable and reliable foundation for graph-text representation learning in real-world web environments.
Problem

Research questions and friction points this paper is trying to address.

Addressing noisy graph-text alignment by dynamically adjusting between many-to-many and one-to-one objectives
Overcoming limitations of static alignment methods that fail under varying data quality
Enhancing robustness and scalability of graph foundation models for real-world applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic quality assessment adjusts alignment objectives
Adaptive switching between many-to-many and one-to-one alignment
Real-time batch-level reliability estimation filters noisy pairs
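The switching behavior listed above can be sketched in a few lines. The sketch below is purely illustrative, not the paper's implementation: the reliability estimator (mean diagonal softmax probability of a batch similarity matrix), the threshold `tau`, the soft-target blend, and the median-based filter are all assumptions standing in for ADAligner's actual components.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dynamic_alignment_loss(sim, tau=0.5):
    """Illustrative ADAligner-style dynamic objective (hypothetical).

    sim[i][j]: scaled similarity between graph sample i and text j;
    pair (i, i) is the supervised match. Returns (loss, reliability).
    """
    n = len(sim)
    rows = [softmax(row) for row in sim]
    pair_conf = [rows[i][i] for i in range(n)]       # per-pair confidence
    reliability = sum(pair_conf) / n                 # batch-level estimate

    if reliability >= tau:
        # Clean supervision: soft many-to-many targets, here a one-hot
        # label blended with the in-batch similarity distribution.
        loss = 0.0
        for i in range(n):
            targets = [0.5 * (1.0 if j == i else 0.0) + 0.5 * rows[i][j]
                       for j in range(n)]
            loss -= sum(t * math.log(rows[i][j])
                        for j, t in enumerate(targets))
        loss /= n
    else:
        # Noisy supervision: strict one-to-one alignment, dynamically
        # filtering out low-confidence pairs (below the batch median).
        med = sorted(pair_conf)[n // 2]
        kept = [i for i in range(n) if pair_conf[i] >= med]
        loss = -sum(math.log(pair_conf[i]) for i in kept) / max(len(kept), 1)
    return loss, reliability
```

On a batch with strong diagonal similarities the reliability estimate is high and the soft many-to-many branch fires; when matched pairs are no more similar than negatives, reliability drops and the filtered one-to-one branch takes over.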
Yuhang Liu
The University of Adelaide
Representation Learning · LLMs · Latent Variable Models · Responsible AI
Minglai Shao
Tianjin University
Graph Mining · Deep Learning · Machine Learning
Zengyi Wo
College of Intelligence and Computing, Tianjin University
Data Mining · Anomaly Detection · LLM Reasoning
Yunlong Chu
School of New Media and Communication, Tianjin University, Tianjin, China
Bing Hao
School of New Media and Communication, Tianjin University, Tianjin, China
Shengzhong Liu
Shanghai Jiao Tong University
Ruijie Wang
School of Computer Science and Engineering, Beihang University, Beijing, China
Jianxin Li
School of Computer Science and Engineering, Beihang University, Beijing, China