🤖 AI Summary
This work investigates how textual and graph-structured representations complement each other in relational reasoning tasks and what this implies for hybrid modeling. We propose Knowledge Co-Distillation (CoD), a unified framework that jointly optimizes a text encoder and a graph neural network while tracking the evolution of their latent spaces across five relational reasoning tasks. Through representation analysis, we characterize stage-wise alignment and divergence patterns, systematically identifying the conditions and intrinsic drivers of modality complementarity. Experiments demonstrate that CoD not only improves multi-task performance but also yields interpretable insights into cross-modal synergy: text enhances semantic generalization, graph structure enforces logical constraints, and their complementarity peaks at specific training stages. To our knowledge, this is the first interpretability framework for multimodal relational reasoning grounded in latent-space dynamics.
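To make the co-distillation idea concrete, below is a minimal sketch of one plausible form of the objective, assuming a symmetric, deep-mutual-learning-style setup in which each modality's softened predictions supervise the other. The toy encoders, dimensions, temperature, and weighting below are illustrative stand-ins, not the paper's actual implementation.

```python
# Hedged sketch: one plausible co-distillation objective (not the paper's exact loss).
# Each modality is trained on the task and also distils its softened predictions
# into the other. ToyEncoder stands in for the paper's text encoder / GNN.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for either the text encoder or the graph neural network."""
    def __init__(self, in_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # unnormalised logits

def co_distillation_loss(text_logits, graph_logits, labels, tau=2.0, alpha=0.5):
    """Task loss for both modalities plus a symmetric KL distillation term."""
    task = F.cross_entropy(text_logits, labels) + F.cross_entropy(graph_logits, labels)
    # Softened distributions; each modality acts as teacher for the other.
    kl_text_to_graph = F.kl_div(F.log_softmax(graph_logits / tau, dim=-1),
                                F.softmax(text_logits.detach() / tau, dim=-1),
                                reduction="batchmean") * tau ** 2
    kl_graph_to_text = F.kl_div(F.log_softmax(text_logits / tau, dim=-1),
                                F.softmax(graph_logits.detach() / tau, dim=-1),
                                reduction="batchmean") * tau ** 2
    return task + alpha * (kl_text_to_graph + kl_graph_to_text)

# Toy usage: a batch of 8 examples with random features and 5 relation classes.
text_enc, graph_enc = ToyEncoder(32, 64, 5), ToyEncoder(16, 64, 5)
x_text, x_graph = torch.randn(8, 32), torch.randn(8, 16)
labels = torch.randint(0, 5, (8,))
loss = co_distillation_loss(text_enc(x_text), graph_enc(x_graph), labels)
loss.backward()
```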
📝 Abstract
Relational reasoning lies at the core of many NLP tasks, drawing on complementary signals from text and graphs. While prior research has investigated how to leverage this dual complementarity, a detailed and systematic understanding of text-graph interplay and its effect on hybrid models is still lacking. We take an analysis-driven approach to investigating text-graph representation complementarity via a unified architecture that supports knowledge co-distillation (CoD). We explore five relational reasoning tasks that differ in how text and graph structure encode the information needed to solve each task. By tracking how these dual representations evolve during training, we uncover interpretable patterns of alignment and divergence, and provide insights into when and why their integration is beneficial.
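As a rough illustration of how latent-space alignment could be tracked over training, the sketch below computes linear CKA between text and graph embeddings at each checkpoint. The choice of CKA and the checkpoint loop are assumptions for illustration; the abstract does not specify which similarity metric is used.

```python
# Hedged sketch: tracking text-graph representation alignment during training
# with linear CKA (one plausible metric; not stated by the paper).
import torch

def linear_cka(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Linear CKA between two (n_samples, dim) representation matrices."""
    x = x - x.mean(dim=0, keepdim=True)  # centre features
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.linalg.norm(y.T @ x) ** 2                     # ||Y^T X||_F^2
    norm = torch.linalg.norm(x.T @ x) * torch.linalg.norm(y.T @ y)
    return cross / norm

# Hypothetical logging loop over saved checkpoints (names are illustrative):
# for step, (text_emb, graph_emb) in enumerate(checkpoint_embeddings):
#     print(step, linear_cka(text_emb, graph_emb).item())
```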