🤖 AI Summary
This work addresses the challenges of building efficient and lightweight text mining models under label-scarce and resource-constrained conditions, where conventional semi-supervised learning methods often suffer from high computational overhead and susceptibility to local optima. To overcome these limitations, the authors propose NanoNet, a novel framework that uniquely integrates online knowledge distillation, mutual learning regularization, and parameter-efficient fine-tuning to collaboratively train multiple lightweight submodels. This synergistic approach substantially reduces reliance on labeled data and training costs while simultaneously enhancing both inference efficiency and model performance. NanoNet thus offers an effective solution for low-supervision, low-latency text mining scenarios.
📝 Abstract
The lightweight semi-supervised learning (LSL) strategy provides an effective approach to conserving labeled samples and minimizing model inference costs. Prior research has successfully applied knowledge transfer learning and co-training regularization from large to small models in LSL. However, such training strategies are computationally intensive and prone to local optima, making the optimal solution harder to find. This prompted us to investigate the feasibility of integrating three low-cost scenarios for text mining tasks: limited labeled supervision, lightweight fine-tuning, and rapid-inference small models. We propose NanoNet, a novel framework for lightweight text mining that implements parameter-efficient learning with limited supervision. It employs online knowledge distillation to generate multiple small models and enhances their performance through mutual learning regularization. The entire process leverages parameter-efficient learning, reducing training costs and minimizing supervision requirements, ultimately yielding a lightweight model for downstream inference.
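The mutual learning regularization mentioned above can be illustrated with a minimal sketch: each submodel is trained on a supervised loss (when a label exists) plus a KL-divergence term pulling it toward its peers' predicted distributions. This is a generic illustration of mutual learning, not NanoNet's actual implementation; the function names, the weighting factor `alpha`, and the single-example formulation are assumptions for clarity.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_learning_loss(prob_lists, label=None, alpha=0.5):
    """Per-submodel loss for one example (hypothetical sketch, not NanoNet's code).

    prob_lists: one predicted class distribution per submodel.
    label: optional gold class index; None mimics an unlabeled example.
    alpha: assumed weight on the peer-regularization term.
    """
    k = len(prob_lists)
    losses = []
    for i, p in enumerate(prob_lists):
        # Peer term: average KL from each other submodel's output to model i's.
        peer_kl = sum(kl_divergence(q, p)
                      for j, q in enumerate(prob_lists) if j != i) / (k - 1)
        # Supervised cross-entropy only when a label is available.
        ce = -math.log(p[label] + 1e-12) if label is not None else 0.0
        losses.append(ce + alpha * peer_kl)
    return losses

# Two submodels' predictions for one labeled example (class 0).
losses = mutual_learning_loss([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]], label=0)
```

On unlabeled data the cross-entropy term vanishes and only the peer agreement term remains, which is how such schemes reduce reliance on labeled samples.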