🤖 AI Summary
ASR models suffer from catastrophic forgetting during on-device personalized fine-tuning, which degrades source-domain generalization; conventional forgetting assessment over a full validation set is infeasible on resource-constrained edge devices due to prohibitive storage and computational overhead. This paper proposes a lightweight forgetting-monitoring framework based on validation-set sub-sampling: it combines distribution-matching-driven sub-sampling with dynamic forgetting quantification to construct a compact yet high-fidelity surrogate validation set, and further designs an adaptive early-stopping strategy to choose the number of fine-tuning epochs. Experiments demonstrate that, compared to random subsets of the same size, the method reduces the mean absolute error of forgetting estimation by 10.3%–60.7%; moreover, across multiple forgetting thresholds, it consistently approximates the behavior of a 50× larger oracle (full) validation set.
📝 Abstract
Automatic Speech Recognition (ASR) is widely used in consumer devices such as mobile phones. Recently, personalization, i.e., on-device model fine-tuning, has shown that adapting ASR models towards target-user speech improves their performance on rare words and accented speech. Despite these gains, fine-tuning on user data (target domain) risks the personalized model forgetting knowledge about its original training distribution (source domain), i.e., catastrophic forgetting, leading to subpar general ASR performance. A simple and efficient approach to combat catastrophic forgetting is to measure forgetting via a validation set that represents the source-domain distribution. However, such validation sets are large and impractical for mobile devices. To this end, we propose a novel method to subsample a substantially large validation set into a smaller one while maintaining the ability to estimate forgetting. We demonstrate the efficacy of such a dataset in mitigating forgetting by using it to dynamically determine the ideal number of fine-tuning epochs. When measuring the deviations in per-user fine-tuning epochs against a 50x larger validation set (oracle), our method achieves a lower mean absolute error (3.39) than randomly selected subsets of the same size (3.78-8.65). Unlike the random baselines, our method consistently tracks the oracle's behaviour across three different forgetting thresholds.
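The abstract describes two components: a distribution-matching subsample of the source-domain validation set, and an early-stopping rule driven by the forgetting estimate. The sketch below is an illustrative stand-in, not the paper's actual algorithm: it uses a simple greedy moment-matching heuristic (pick utterances whose running feature mean stays closest to the full set's mean) as the distribution-matching step, and treats forgetting as the relative word-error-rate (WER) increase on the surrogate set; `subsample_validation` and `should_stop` are hypothetical names.

```python
import numpy as np

def subsample_validation(features: np.ndarray, k: int) -> list[int]:
    """Greedy moment matching: select k utterances whose running mean of
    per-utterance feature vectors stays closest to the full validation
    set's mean. A simple proxy for distribution-matching sub-sampling."""
    target = features.mean(axis=0)
    chosen, remaining = [], list(range(len(features)))
    running_sum = np.zeros_like(target)
    for step in range(1, k + 1):
        cands = np.array(remaining)
        # Mean of the subset if each candidate were added next.
        new_means = (running_sum + features[cands]) / step
        best = int(cands[np.argmin(np.linalg.norm(new_means - target, axis=1))])
        chosen.append(best)
        remaining.remove(best)
        running_sum += features[best]
    return chosen

def should_stop(surrogate_wer: float, baseline_wer: float,
                threshold: float = 0.05) -> bool:
    """Adaptive early stopping: halt fine-tuning once estimated forgetting
    (relative WER increase on the surrogate source-domain set, measured
    against the pre-fine-tuning baseline) exceeds the threshold."""
    forgetting = (surrogate_wer - baseline_wer) / baseline_wer
    return forgetting > threshold
```

In use, one would score the surrogate set after each fine-tuning epoch and stop at the first epoch where `should_stop` returns `True`; sweeping `threshold` corresponds to the multiple forgetting thresholds evaluated in the paper.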