🤖 AI Summary
This work addresses the limited generalization, poor interpretability, and low controllability of Transformer language models by proposing a novel paradigm that explicitly embeds and disentangles reasoning rules in the latent space. Methodologically, we design a Transformer-based language variational autoencoder (VAE), modeling reasoning rules as functional mappings and achieving rule disentanglement and feature clustering within the encoder’s feed-forward network (FFN) parameter space. We further introduce rule-specific supervision signals and a key-value memory mechanism, enhancing the decoder’s ability to retrieve rule knowledge via query-based injection of prior information. Experiments demonstrate effective rule disentanglement, with FFN layers proving more conducive than attention layers to preserving rule separability. The approach is validated on mathematical reasoning tasks, confirming improved reasoning fidelity; however, performance exhibits saturation with increasing training sample size, revealing a fundamental data-efficiency bottleneck.
📝 Abstract
Incorporating explicit reasoning rules into the latent space of language models (LMs) offers a promising pathway to enhance generalisation, interpretability, and controllability. While current Transformer-based LMs perform strongly on Natural Language Inference (NLI) tasks, they often rely on memorisation rather than rule-based inference. This work investigates how reasoning rules can be explicitly embedded and memorised within LMs through Language Variational Autoencoders (VAEs). We propose a complete pipeline for learning reasoning rules within Transformer-based language VAEs, encompassing three rule-based reasoning tasks, a supporting theoretical framework, and a practical end-to-end architecture. The experiments yield the following findings. Disentangled reasoning: under explicit signal supervision, reasoning rules - viewed as functional mappings - can be disentangled within the encoder's parametric space; this separation produces distinct clustering of rules in the output feature space. Prior knowledge injection: injecting reasoning information into the Query enables the model to more effectively retrieve the stored Value from memory via the Key, offering a simple method for integrating prior knowledge into decoder-only language models. Performance bottleneck: in mathematical reasoning tasks with Qwen2.5 (0.5B), increasing the sample count does not improve performance beyond a certain point; moreover, FFN layers preserve the separation of reasoning rules in the model's parameters better than attention layers.
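The Query-side prior-injection idea can be illustrated with a minimal scaled dot-product attention sketch over a key-value memory. This is a toy NumPy illustration under assumed names (`keys`, `values`, `rule_prior` are all hypothetical), not the paper's actual architecture: biasing the Query toward a rule's Key shifts attention mass onto that rule's stored Value.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    """Scaled dot-product attention over a key-value memory."""
    scores = query @ keys.T / np.sqrt(keys.shape[-1])
    return softmax(scores) @ values

d = 8
keys = np.eye(4, d)           # four stored rule keys (hypothetical memory slots)
values = np.eye(4)            # one-hot rule values, for illustration only
query = np.full((1, d), 0.1)  # an uninformative query: attends uniformly

rule_prior = keys[2]          # prior signal identifying rule 2 (hypothetical)

plain = attend(query, keys, values)
injected = attend(query + rule_prior, keys, values)  # Query-side prior injection

# Without the prior, attention is uniform (0.25 per rule); with it,
# the attention mass on rule 2's value increases.
print(plain[0, 2], injected[0, 2])
```

The design choice mirrored here is that the memory (Keys/Values) stays fixed; only the Query is perturbed with the prior, which is what makes the scheme easy to bolt onto a decoder-only model.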