🤖 AI Summary
Resource-constrained edge devices face challenges in accurately understanding user intent from UI interaction traces, while simultaneously ensuring privacy preservation and real-time responsiveness.
Method: This paper proposes a two-stage decomposed architecture: (1) generating structured sequential summaries of interaction behaviors, followed by (2) lightweight intent inference based on these summaries. The approach integrates context-aggregated enhancement and task-adaptive fine-tuning to strengthen semantic modeling capabilities of small models.
Contribution/Results: Experimental results demonstrate that, under identical privacy guarantees and low-latency constraints, the proposed method achieves higher intent recognition accuracy than state-of-the-art large multimodal language models. It establishes an efficient, privacy-aware, and real-time interaction understanding paradigm for on-device intelligent agents.
📝 Abstract
Understanding user intents from UI interaction trajectories remains a challenging yet crucial frontier in intelligent agent development. While massive, datacenter-based multimodal large language models (MLLMs) have the capacity to handle the complexities of such sequences, smaller models, which can run on-device to provide a privacy-preserving, low-cost, and low-latency user experience, struggle with accurate intent inference. We address this limitation with a novel decomposed approach: first, we perform structured interaction summarization, capturing the key information from each user action; second, we perform intent extraction with a fine-tuned model operating on the aggregated summaries. This method improves intent understanding in resource-constrained models, even surpassing the base performance of large MLLMs.
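The two-stage decomposition described above can be sketched as a simple pipeline. This is a minimal illustrative sketch, not the paper's implementation: the `UIAction` structure, the summary template, and the keyword heuristic in `infer_intent` are all hypothetical stand-ins for the small on-device models the paper actually fine-tunes.

```python
from dataclasses import dataclass

@dataclass
class UIAction:
    """One raw UI interaction (hypothetical schema for illustration)."""
    element: str     # UI element acted on, e.g. "search_bar"
    verb: str        # interaction type, e.g. "tap", "type"
    value: str = ""  # optional payload, e.g. typed text

def summarize_action(action: UIAction) -> str:
    """Stage 1: compress one raw action into a structured textual summary.
    In the paper a small model produces this; a template stands in here."""
    if action.value:
        return f"{action.verb}({action.element}: {action.value!r})"
    return f"{action.verb}({action.element})"

def infer_intent(summaries: list[str]) -> str:
    """Stage 2: lightweight intent inference over the aggregated summaries.
    A fine-tuned small model would consume the joined summaries; this
    keyword heuristic is only a placeholder."""
    trace = " -> ".join(summaries)
    if "search" in trace:
        return "search"
    if "checkout" in trace or "buy" in trace:
        return "purchase"
    return "browse"

# Example trajectory: the raw actions never leave the device; only the
# compact stage-1 summaries are passed to the stage-2 intent model.
trace = [
    UIAction("search_bar", "tap"),
    UIAction("search_bar", "type", "wireless earbuds"),
    UIAction("result_3", "tap"),
]
summaries = [summarize_action(a) for a in trace]
print(infer_intent(summaries))  # -> search
```

The key design point the sketch mirrors is that stage 2 operates only on aggregated summaries, which keeps the second model's input short and structured, enabling the low-latency, on-device inference the paper targets.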