Agentic Knowledge Distillation: Autonomous Training of Small Language Models for SMS Threat Detection

📅 2026-02-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two obstacles to on-device SMS phishing (smishing) detection: labelled threat data goes out of date quickly, and compact models are hard to retrain continuously. The authors propose an autonomous teacher–student framework in which a large language model (LLM), such as Claude Opus 4.5 or GPT-5.2 Codex, generates synthetic training data. Through closed-loop feedback and targeted refinement, the teacher iteratively distills knowledge into a compact student model (Qwen2.5-0.5B or SmolLM2-135M) without human intervention. The best configuration reaches 94.31% accuracy and 96.25% recall. The approach also substantially outperforms a direct preference optimization (DPO) baseline that lacks the iterative feedback (e.g. 86-94% vs 50-80% accuracy), although results depend strongly on which teacher LLM is used.
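For readers who want the two reported metrics spelled out, here is a minimal sketch of how accuracy and recall are conventionally computed for a binary classifier. It assumes the positive class (label 1) is spam/smishing; the summary does not state which class recall is measured on, and the function name is illustrative rather than taken from the paper.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (accuracy, recall on the positive class).

    Assumption: `positive` marks the threat class (spam/smishing).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    # Accuracy: fraction of all messages classified correctly.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # Recall: fraction of actual threats that the model caught.
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_pos = sum(1 for t in y_true if t == positive)

    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall
```

High recall matters in this setting because a missed threat (false negative) reaches the user, whereas accuracy alone can look good on an imbalanced dataset even when threats slip through.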

📝 Abstract

SMS-based phishing (smishing) attacks have surged, yet training effective on-device detectors requires labelled threat data that quickly becomes outdated. To address this, we present Agentic Knowledge Distillation, in which a powerful LLM acts as an autonomous teacher that fine-tunes a smaller student SLM for security tasks without human intervention. The teacher LLM autonomously generates synthetic data and iteratively refines the on-device student model until performance plateaus. We compare four LLMs in this teacher role (Claude Opus 4.5, GPT 5.2 Codex, Gemini 3 Pro, and DeepSeek V3.2) on SMS spam/smishing detection with two student SLMs (Qwen2.5-0.5B and SmolLM2-135M). Our results show that performance varies substantially depending on the teacher LLM, with the best configuration achieving 94.31% accuracy and 96.25% recall. We also compare against a Direct Preference Optimisation (DPO) baseline that uses the same synthetic knowledge and LoRA setup but without iterative feedback or targeted refinement; agentic knowledge distillation substantially outperforms it (e.g. 86-94% vs 50-80% accuracy), showing that closed-loop feedback and targeted refinement are critical. These findings demonstrate that agentic knowledge distillation can rapidly yield effective security classifiers for edge deployment, but outcomes depend strongly on which teacher LLM is used.
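The closed loop described in the abstract (generate synthetic data, fine-tune the student, evaluate, refine on the errors, stop at a plateau) can be sketched roughly as follows. This is an illustrative outline, not the authors' implementation: the three callables, the `patience` and `min_gain` stopping rule, and the round limit are all assumptions, since the abstract does not specify how the plateau is detected.

```python
from typing import Callable, List, Tuple

# (SMS text, label); assumed convention: 1 = spam/smishing, 0 = legitimate
Example = Tuple[str, int]


def distill_until_plateau(
    generate: Callable[[List[Example]], List[Example]],
    train: Callable[[List[Example]], None],
    evaluate: Callable[[], Tuple[float, List[Example]]],
    patience: int = 2,
    min_gain: float = 0.005,
    max_rounds: int = 20,
) -> List[float]:
    """Run teacher-driven rounds until accuracy stops improving.

    All three callables are hypothetical stand-ins:
      generate(errors) -> synthetic examples targeting the student's errors
      train(examples)  -> fine-tunes the student on those examples
      evaluate()       -> (accuracy, misclassified examples)
    Returns the accuracy measured after each round.
    """
    history: List[float] = []
    best = float("-inf")
    stale_rounds = 0
    errors: List[Example] = []  # first round has no feedback yet

    for _ in range(max_rounds):
        batch = generate(errors)    # teacher writes targeted synthetic data
        train(batch)                # student is fine-tuned on it
        acc, errors = evaluate()    # feedback closes the loop
        history.append(acc)

        if acc > best + min_gain:
            best = acc
            stale_rounds = 0
        else:
            stale_rounds += 1
            if stale_rounds >= patience:
                break               # performance has plateaued
    return history
```

In this framing, the DPO baseline mentioned above corresponds to a single pass with no `errors` fed back to the teacher, which is the component the abstract identifies as critical.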

Problem

Research questions and friction points this paper is trying to address.

SMS threat detection

smishing

on-device detection

labelled data scarcity

small language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic Knowledge Distillation

Autonomous Teacher

On-device SLM