Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection

📅 2025-04-12
🤖 AI Summary
This work addresses the challenge of achieving both high accuracy and computational efficiency for real-time scene detection on resource-constrained mobile devices. We propose a three-stage cyclic training framework that alternates between exploration and stabilization phases, and we incorporate semi-supervised domain adaptation (SSDA) into training for mobile deployment so that the model can draw on both large pre-trained models and unlabeled target-domain data. Combined with a lightweight network architecture and CPU-optimized inference, the approach achieves 94.00% Top-1 and 99.17% Top-3 accuracy on the CamSSD dataset with 1.61 ms latency per frame on CPU, which meets on-device real-time requirements. The core contributions are: (i) an SSDA paradigm designed explicitly for mobile deployment, which the authors describe as the first of its kind; (ii) a scalable cyclic training framework; and (iii) an end-to-end lightweight solution with a state-of-the-art accuracy-latency trade-off.
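The summary does not say how the three stages or the two phases are configured. As a rough illustration only, the sketch below assumes that each stage runs an exploration phase followed by a stabilization phase, and that exploration simply means a higher learning rate. The function name, the epoch counts, and the learning-rate values are all hypothetical and do not come from the paper.

```python
def cycle_schedule(stages=3, epochs_per_phase=2,
                   explore_lr=1e-2, stabilize_lr=1e-3):
    """Build a per-epoch plan of (stage, phase, learning_rate) tuples.

    Assumption (not from the paper): every stage consists of an
    exploration phase at a higher learning rate followed by a
    stabilization phase at a lower one.
    """
    schedule = []
    for stage in range(1, stages + 1):
        phases = (("exploration", explore_lr), ("stabilization", stabilize_lr))
        for phase, lr in phases:
            for _ in range(epochs_per_phase):
                schedule.append((stage, phase, lr))
    return schedule


plan = cycle_schedule()
for stage, phase, lr in plan:
    print(f"stage {stage}: {phase:<13} lr={lr}")
```

A real training loop would consume one entry of `plan` per epoch and set the optimizer's learning rate accordingly; the actual stage definitions should be taken from the paper itself.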

📝 Abstract
Smartphones are now ubiquitous. At the same time, the rapid development of AI has spurred extensive research on applying deep learning techniques to image classification. However, because mobile devices have limited resources, balancing accuracy with computational efficiency remains a significant challenge. In this paper, we propose a novel training framework called Cycle Training, which adopts a three-stage training process that alternates between exploration and stabilization phases to optimize model performance. Additionally, we incorporate Semi-Supervised Domain Adaptation (SSDA) to leverage large models and unlabeled data, thereby effectively expanding the training dataset. Comprehensive experiments on the CamSSD dataset for mobile scene detection demonstrate that our framework both significantly improves classification accuracy and ensures real-time inference efficiency. Specifically, our method achieves 94.00% Top-1 accuracy and 99.17% Top-3 accuracy and runs inference in just 1.61 ms on a CPU, demonstrating its suitability for real-world mobile deployment.
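For readers unfamiliar with the reported metrics, the following minimal sketch shows how Top-1 and Top-3 accuracy are commonly computed from per-class scores. The function name and the toy scores are illustrative assumptions; they are not the paper's evaluation code or data.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for four samples over four classes (made-up values).
scores = [
    [0.70, 0.20, 0.10, 0.00],  # true class 0: ranked first
    [0.10, 0.50, 0.30, 0.10],  # true class 2: ranked second
    [0.40, 0.30, 0.20, 0.10],  # true class 3: ranked fourth
    [0.05, 0.05, 0.10, 0.80],  # true class 3: ranked first
]
labels = [0, 2, 3, 3]

top1 = top_k_accuracy(scores, labels, k=1)  # 2 of 4 correct -> 0.5
top3 = top_k_accuracy(scores, labels, k=3)  # 3 of 4 correct -> 0.75
print(top1, top3)
```

Top-3 accuracy is always at least as high as Top-1, which is consistent with the 94.00% and 99.17% figures reported above.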
Problem

Research questions and friction points this paper is trying to address.

Balancing accuracy and computational efficiency for mobile AI
Leveraging unlabeled data via semi-supervised domain adaptation
Enabling real-time scene detection on resource-limited smartphones
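Per-frame latency figures such as the 1.61 ms reported here are typically obtained by timing repeated inference calls after a warm-up. The sketch below shows one way to do that with the Python standard library. `dummy_infer` is a hypothetical stand-in for a real model, and the warm-up and run counts are arbitrary assumptions, so the printed numbers say nothing about the paper's model.

```python
import statistics
import time


def measure_latency_ms(infer, frame, warmup=10, runs=100):
    """Return (mean, median) wall-clock latency of infer(frame) in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        infer(frame)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings), statistics.median(timings)


def dummy_infer(frame):
    # Hypothetical placeholder workload, not a scene classifier.
    return sum(frame) % 7


frame = [i % 256 for i in range(10_000)]
mean_ms, median_ms = measure_latency_ms(dummy_infer, frame)
print(f"mean {mean_ms:.3f} ms, median {median_ms:.3f} ms")
```

Reporting the median alongside the mean helps when occasional scheduler pauses inflate a few samples, which is common on mobile CPUs.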
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cycle Training with three-stage optimization process
Semi-Supervised Domain Adaptation for data expansion
Real-time mobile inference with high accuracy
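The page does not describe how SSDA expands the training data. One common way to use a large model together with unlabeled data is confidence-thresholded pseudo-labeling, sketched below. This is a generic technique shown for illustration, not necessarily the authors' method; the function name, the threshold of 0.9, and the probability values are assumptions.

```python
def select_pseudo_labels(teacher_probs, threshold=0.9):
    """Keep unlabeled samples whose top teacher probability meets the threshold.

    Returns a list of (sample_index, predicted_class) pairs that could be
    added to the labeled training set.
    """
    selected = []
    for index, probs in enumerate(teacher_probs):
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence)))
    return selected


# Made-up class probabilities from a hypothetical large teacher model.
teacher_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # uncertain: discarded
    [0.05, 0.92, 0.03],  # confident: class 1
]

pseudo = select_pseudo_labels(teacher_probs, threshold=0.9)
print(pseudo)  # [(0, 0), (2, 1)]
```

A higher threshold keeps fewer but cleaner pseudo-labels, while a lower one adds more data at the cost of more label noise.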
Authors
Huu-Phong Phan-Nguyen
Ho Chi Minh University of Technology, VNU-HCM, Vietnam
Anh Dao
Undergraduate Student, Michigan State University
Vision-language, Multimodal LLM, Embodied AI, LLM
Tien-Huy Nguyen
Ho Chi Minh University of Technology, VNU-HCM, Vietnam
Tuan Quang
LPL Financial, USA
Huu-Loc Tran
Ho Chi Minh University of Technology, VNU-HCM, Vietnam
Tinh-Anh Nguyen-Nhu
Ho Chi Minh University of Technology, VNU-HCM, Vietnam
Huy Pham
Aarhus University
Robotics, Autonomous Navigation, Artificial Intelligence
Quan Nguyen
Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
Hoang M. Le
York University, Canada
Q. Dinh
AI VIETNAM Lab