Is Micro Domain-Adaptive Pre-Training Effective for Real-World Operations? Multi-Step Evaluation Reveals Potential and Bottlenecks

📅 2026-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic evaluation of micro domain-adaptive pre-training (mDAPT) for handling proprietary knowledge in real-world enterprise generative tasks. Focusing on IT technical support scenarios, the work decomposes generative question answering into three subtasks (fact elicitation, reasoning, and answer composition) and establishes a fine-grained, LLM-based evaluation framework, using a proprietary product knowledge corpus for empirical analysis. The study shows that mDAPT substantially improves fact elicitation but yields limited gains in reasoning and long-form answer composition. Further analysis shows that resolving both the elicitation and reasoning subtasks suffices for over 90% overall performance, underscoring the critical role of reasoning capability in system effectiveness.

📝 Abstract
When applying LLMs to real-world enterprise operations, LLMs need to handle proprietary knowledge in small domains of specific operations ($\textbf{micro domains}$). A previous study shows micro domain-adaptive pre-training ($\textbf{mDAPT}$) with fewer documents is effective, similarly to DAPT in larger domains. However, it evaluates mDAPT only on multiple-choice questions; thus, its effectiveness for generative tasks in real-world operations remains unknown. We aim to reveal the potential and bottlenecks of mDAPT for generative tasks. To this end, we disentangle the answering process into three subtasks and evaluate the performance of each subtask: (1) $\textbf{eliciting}$ facts relevant to questions from an LLM's own knowledge, (2) $\textbf{reasoning}$ over the facts to obtain conclusions, and (3) $\textbf{composing}$ long-form answers based on the conclusions. We verified mDAPT on proprietary IT product knowledge for real-world questions in IT technical support operations. As a result, mDAPT resolved the elicitation task that the base model struggled with but did not resolve other subtasks. This clarifies mDAPT's effectiveness in the knowledge aspect and its bottlenecks in other aspects. Further analysis empirically shows that resolving the elicitation and reasoning tasks ensures sufficient performance (over 90%), emphasizing the need to enhance reasoning capability.
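The three-subtask decomposition described in the abstract can be sketched as a staged pipeline in which each step is scored independently, so that a failure can be attributed to elicitation, reasoning, or composition. The stub model, prompt prefixes, and exact-match scoring below are illustrative assumptions for a minimal sketch, not the paper's actual implementation (the paper uses LLM-based fine-grained judging):

```python
# Hypothetical sketch of the three-subtask evaluation: (1) elicit facts,
# (2) reason over them to a conclusion, (3) compose a long-form answer.
# The canned model outputs and exact-match scoring are illustrative only.

def stub_model(prompt: str) -> str:
    """Stand-in for an (mDAPT-adapted) LLM; returns canned outputs."""
    canned = {
        "elicit": "Product X supports protocol Y up to version 2.",
        "reason": "Therefore the reported error stems from using version 3.",
        "compose": "The error occurs because Product X supports protocol Y "
                   "only up to version 2; upgrading the product resolves it.",
    }
    for task, output in canned.items():
        if prompt.startswith(task):
            return output
    return ""

def evaluate_subtasks(question: str, references: dict) -> dict:
    """Score each subtask separately so bottlenecks are attributable."""
    facts = stub_model(f"elicit: {question}")
    conclusion = stub_model(f"reason: {facts}")
    answer = stub_model(f"compose: {conclusion}")
    outputs = {"elicit": facts, "reason": conclusion, "compose": answer}
    # Exact match is a placeholder for the paper's LLM-based judging.
    return {task: float(outputs[task] == references[task]) for task in outputs}

if __name__ == "__main__":
    refs = {
        "elicit": "Product X supports protocol Y up to version 2.",
        "reason": "Therefore the reported error stems from using version 3.",
        "compose": "The error occurs because Product X supports protocol Y "
                   "only up to version 2; upgrading the product resolves it.",
    }
    print(evaluate_subtasks("Why does protocol Y v3 fail on Product X?", refs))
```

Scoring the stages separately is what lets the analysis conclude that mDAPT resolves elicitation while reasoning remains the bottleneck.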
Problem

Research questions and friction points this paper is trying to address.

micro domain
domain-adaptive pre-training
generative tasks
proprietary knowledge
real-world operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

micro domain-adaptive pre-training
multi-step evaluation
generative tasks
reasoning capability
proprietary knowledge
Masaya Tsunokake
Research & Development Group, Hitachi, Ltd, Tokyo, Japan
Yuta Koreeda
Hitachi, Ltd., Hitachi America, Ltd., Stanford CS
natural language processing, machine learning, robot, computer assisted surgery
Terufumi Morishita
Central Research Laboratory, Hitachi, Ltd.
NLP, RL, ML
Koichi Nagatsuka
Research & Development Group, Hitachi, Ltd, Tokyo, Japan
Hikaru Tomonari
Research & Development Group, Hitachi, Ltd, Tokyo, Japan
Yasuhiro Sogawa
Research & Development Group, Hitachi, Ltd, Tokyo, Japan