🤖 AI Summary
To address the dual challenges of domain knowledge deficiency and degraded instruction-following capability in large language models (LLMs) for mortgage finance, this paper proposes a dual-track specialization framework. Methodologically, it introduces a dual-expert architecture, comprising a dialogue expert and a structured-task expert, together with an "expert self-routing" mechanism in which one of the experts itself classifies incoming requests by task type via few-shot prompting. Additionally, instruction residual adaptation is incorporated to restore instruction-following ability after domain adaptation. Built upon LLaMA-3.1-8B, the framework integrates domain-specific pretraining, residual instruction transfer, supervised fine-tuning (SFT), direct preference optimization (DPO), and task routing optimization. Experiments on three mortgage-finance benchmarks (summarization, question answering, and classification) yield scores of 4.58 (+0.59), 4.09 (+0.09), and 2.6 (+1.4), respectively, with consistently higher BERTScore than the baselines. The results validate the effectiveness of jointly enhancing domain knowledge and preserving instruction fidelity.
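The "expert self-routing" step can be pictured as a few-shot classification prompt sent to one of the experts, whose predicted label then selects which model handles the request. A minimal sketch; the prompt wording, label set, and `generate` callable are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical few-shot routing prompt; the example requests and labels
# are placeholders, not taken from the paper's training data.
ROUTER_PROMPT = """Classify the user request into one of: qa, summarization, classification.

Request: What documents are needed to verify a borrower's income?
Task: qa

Request: Summarize the following loan-servicing call transcript: ...
Task: summarization

Request: Label this mortgage document as note, deed, or disclosure: ...
Task: classification

Request: {request}
Task:"""

VALID_LABELS = {"qa", "summarization", "classification"}


def route(request, generate):
    """Route a request by asking an expert model to classify it.

    `generate` is any callable returning the expert's completion for a
    prompt (e.g., a wrapper around the structured-task expert). Falls
    back to Q&A when the returned label cannot be parsed.
    """
    label = generate(ROUTER_PROMPT.format(request=request)).strip().lower()
    return label if label in VALID_LABELS else "qa"
```

The fallback to the conversational expert is a defensive choice for malformed completions; the paper does not specify how unparseable router outputs are handled.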
📝 Abstract
Large Language Models (LLMs) demonstrate exceptional capabilities across general domains, yet their application to specialized sectors such as mortgage finance requires domain-specific knowledge augmentation while preserving instruction-following fidelity. We present MortgageLLM, a novel domain-specific large language model that addresses this dual challenge. It is developed using a dual-track specialization framework from a single base model (LLaMA-3.1-8B). We opted for this dual-expert approach because a single multi-task model suffers from performance trade-offs: optimizing for structured tasks via SFT degrades the conversational fidelity gained via DPO. Our dual-track method resolves this by creating two specialists, each optimally trained for its distinct capability. Our approach applies the instruction residual technique to restore instruction-following capabilities after domain adaptation without additional supervised fine-tuning. We contribute: (1) application of this residual technique to the highly specialized mortgage finance domain; (2) a dual-expert architecture combining a conversational Q&A model and a structured task model for classification and summarization; and (3) an intelligent task routing mechanism using few-shot classification performed by one of the expert models itself. We validate our approach on domain-specific benchmarks, where our final model (MLM v2) significantly outperforms the base LLaMA-3.1-8B-Instruct, achieving an LLM-as-a-Judge summarization score of 4.58 (vs. 3.99), a Q&A score of 4.09 (vs. 4.0), and a classification score of 2.6 (vs. 1.2). On semantic similarity, our model achieved a BERTScore of 0.77 for summarization (vs. 0.74), 0.68 for Q&A (vs. 0.58), and 0.75 for classification (vs. 0.73), consistently outperforming the baseline approaches.
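The instruction residual technique mentioned in the abstract amounts to simple weight arithmetic: subtract the base model's weights from the instruction-tuned model's weights, then add that delta onto the domain-adapted checkpoint. A minimal sketch under that reading; the function and parameter names are illustrative, and the toy scalar "weights" stand in for full parameter tensors:

```python
def apply_instruction_residual(base_weights, instruct_weights, domain_weights):
    """Restore instruction following after domain-adaptive pretraining.

    The residual (instruct - base) captures what instruction tuning
    added on top of the base model; adding it to the domain-adapted
    weights transfers that capability without further supervised
    fine-tuning. Each argument maps parameter names to tensors
    (plain numbers also work, for illustration).
    """
    merged = {}
    for name, w_base in base_weights.items():
        residual = instruct_weights[name] - w_base  # instruction-tuning delta
        merged[name] = domain_weights[name] + residual
    return merged


# Toy illustration with a single scalar parameter:
merged = apply_instruction_residual(
    base_weights={"w": 1.0},
    instruct_weights={"w": 1.5},  # base + instruction delta of 0.5
    domain_weights={"w": 2.0},    # base + domain-adaptation delta of 1.0
)
# merged["w"] == 2.5: both deltas are retained
```

In practice the same elementwise arithmetic would run over every tensor in the models' state dicts, which requires all three checkpoints to share one architecture, as they do here, since every model derives from LLaMA-3.1-8B.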