AI Summary
Black-box NLP models (e.g., GPT, BERT) lack sufficient interpretability for high-stakes applications in healthcare, finance, and customer management. Method: This paper surveys cross-domain eXplainable Natural Language Processing (XNLP) practice, integrating model-agnostic explanation techniques (LIME/SHAP), attention visualization, counterfactual reasoning, and human-factors evaluation to establish, for the first time, a comprehensive XNLP practice framework aligned with real-world deployment scenarios. Contributions: (1) Proposes industry-tailored XNLP design principles and evaluation guidelines; (2) Systematically identifies and addresses three critical research gaps: real-world applicability, human-AI collaborative evaluation, and quantitative explainability metrics; and (3) Clarifies domain-specific core explanation requirements, thereby advancing XNLP from theoretical validation toward high-reliability, production-ready deployment.
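As a concrete illustration of the model-agnostic techniques the summary refers to, the minimal sketch below applies LIME to an off-the-shelf transformer sentiment classifier. It is not taken from the paper; the model checkpoint, label ordering, and example sentence are illustrative assumptions.

```python
# Minimal sketch: LIME explanation for a transformer text classifier.
# Checkpoint, labels, and input text are assumptions for illustration only.
import numpy as np
from lime.lime_text import LimeTextExplainer
from transformers import pipeline

# Off-the-shelf sentiment classifier (hypothetical choice of checkpoint).
clf = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

def predict_proba(texts):
    """Return class probabilities in a fixed [NEGATIVE, POSITIVE] order."""
    outputs = clf(list(texts), truncation=True)
    probs = []
    for out in outputs:
        p_pos = out["score"] if out["label"] == "POSITIVE" else 1.0 - out["score"]
        probs.append([1.0 - p_pos, p_pos])
    return np.array(probs)

explainer = LimeTextExplainer(class_names=["NEGATIVE", "POSITIVE"])
explanation = explainer.explain_instance(
    "The claim was denied despite complete documentation.",  # example input
    predict_proba,
    num_features=6,  # top tokens contributing to the prediction
)
print(explanation.as_list())  # (token, weight) pairs for the predicted class
```

In a deployment-oriented XNLP workflow of the kind the paper discusses, such per-prediction token attributions would typically be surfaced to domain experts (e.g., clinicians or fraud analysts) alongside the model's decision rather than consumed only by developers.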
Abstract
Natural Language Processing (NLP) has become a cornerstone of many critical sectors, including healthcare, finance, and customer relationship management. This is especially true of advanced models such as GPT-based architectures and BERT, which are increasingly embedded in decision-making processes. However, the black-box nature of these advanced NLP models has created an urgent need for transparency and explainability. This review explores explainable NLP (XNLP) with a focus on its practical deployment and real-world applications, examining its implementation and the challenges faced in domain-specific contexts. The paper underscores the importance of explainability in NLP and provides a comprehensive perspective on how XNLP can be designed to meet the unique demands of various sectors, from healthcare's need for clear clinical insights to finance's emphasis on fraud detection and risk assessment. Additionally, this review aims to bridge the knowledge gap in the XNLP literature by offering a domain-specific exploration and discussing underrepresented areas such as real-world applicability, metric evaluation, and the role of human interaction in model assessment. The paper concludes by suggesting future research directions that could enhance the understanding and broader application of XNLP.