๐ค AI Summary
This study systematically investigates how LoRA fine-tuning enables large language models (Mistral-7B, LLaMA3.1-8B, Pythia-6.9B) to model and leverage relevance signals for paragraph re-ranking. To this end, we employ multi-rank (1/2/8/32) LoRA configurations, layer-wise behavior tracking, module-level ablation, and evaluation on MS MARCO. Our analysis uncovers the dynamic evolution of relevance modeling during adaptation. We identifyโ for the first timeโthe critical fine-tuned layers (mid-transformer blocks) and core subspaces (Q/K projections within multi-head attention) that dominantly govern re-ranking performance. Moreover, we reveal a nonlinear relationship between LoRA rank and module importance: low-rank adapters (e.g., rank 1 or 2) suffice to capture relevance effectively. These findings establish a new paradigm for interpretable, parameter-efficient adaptation in information retrieval. All models and analysis code are publicly released.
๐ Abstract
We conduct a behavioral exploration of LoRA fine-tuned LLMs for Passage Reranking to understand how relevance signals are learned and deployed by Large Language Models. By fine-tuning Mistral-7B, LLaMA3.1-8B, and Pythia-6.9B on MS MARCO under diverse LoRA configurations, we investigate how relevance modeling evolves across checkpoints, the impact of LoRA rank (1, 2, 8, 32), and the relative importance of updated MHA vs. MLP components. Our ablations reveal which layers and projections within LoRA transformations are most critical for reranking accuracy. These findings offer fresh explanations into LoRA's adaptation mechanisms, setting the stage for deeper mechanistic studies in Information Retrieval. All models used in this study have been shared.