🤖 AI Summary
To address the dual challenges of scarce high-quality annotated data and the high deployment cost of large language models in spoken language understanding (SLU), this paper proposes an adaptive feature distillation framework. The method employs a residual projection neural network (RPNN) to align the heterogeneous feature spaces of a teacher model (based on GTE) and a lightweight student model, and introduces a dynamic distillation coefficient (DDC) mechanism that adaptively adjusts distillation strength based on real-time performance feedback from the intent classification and slot filling tasks. This design improves knowledge transfer efficiency and generalization. Evaluated on the Chinese ProSLU benchmark, the proposed approach achieves 95.67% intent accuracy, 92.02% slot F1 score, and 85.50% overall (joint) accuracy, setting a new state of the art.
📝 Abstract
Spoken Language Understanding (SLU) is a core component of conversational systems, enabling machines to interpret user utterances. Despite its importance, developing effective SLU systems remains challenging due to the scarcity of labeled training data and the computational burden of deploying Large Language Models (LLMs) in real-world applications. To address these issues, we propose AFD-SLU, an Adaptive Feature Distillation framework that transfers rich semantic representations from a General Text Embeddings (GTE)-based teacher model to a lightweight student model. Our method introduces a dynamic adapter equipped with a Residual Projection Neural Network (RPNN) to align heterogeneous feature spaces, and a Dynamic Distillation Coefficient (DDC) that adaptively modulates the distillation strength based on real-time feedback from intent and slot prediction performance. Experiments on the Chinese profile-based ProSLU benchmark demonstrate that AFD-SLU achieves state-of-the-art results, with 95.67% intent accuracy, 92.02% slot F1 score, and 85.50% overall accuracy.
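The two mechanisms above can be sketched in a few lines. Everything here is an illustrative assumption, not the paper's implementation: the toy dimensions, the single-layer residual projection standing in for the RPNN, and the threshold-style DDC update rule are all hypothetical.

```python
import random

# Illustrative sketch only: architecture, dimensions, and the DDC update
# rule are assumptions for exposition, not the authors' implementation.
random.seed(0)
d_student, d_teacher = 4, 8          # toy feature dimensions

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

W = rand_matrix(d_student, d_teacher)    # main projection into teacher space
P = rand_matrix(d_student, d_teacher)    # residual projection path

def matvec(M, v):
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def rpnn(h_student):
    # Map student features into the teacher's feature space: a main
    # projection plus a residual shortcut, so gradients have a direct path.
    return [p + r for p, r in zip(matvec(W, h_student), matvec(P, h_student))]

def update_ddc(ddc, prev_score, curr_score, step=0.1, lo=0.0, hi=1.0):
    # Dynamic Distillation Coefficient: when task performance (e.g. intent
    # accuracy or slot F1 on a dev set) stalls, lean more on the teacher;
    # when it improves, let the task losses dominate.
    if curr_score <= prev_score:
        return min(hi, ddc + step)
    return max(lo, ddc - step)

h = [random.gauss(0.0, 1.0) for _ in range(d_student)]
aligned = rpnn(h)                        # now comparable to teacher features
ddc = update_ddc(0.5, prev_score=0.90, curr_score=0.89)
print(len(aligned), round(ddc, 2))       # 8 0.6
```

During training, the feature-distillation loss between `rpnn(h_student)` and the teacher embedding would be scaled by the current coefficient before being added to the intent and slot losses.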