Task-oriented Age of Information for Remote Inference with Hybrid Language Models

📅 2025-04-10
🤖 AI Summary
This paper addresses the fundamental trade-off between accuracy and timeliness in remote inference systems. To jointly capture task-specific utility and information freshness, we adopt the Task-oriented Age of Information (TAoI), a metric that explicitly couples temporal staleness with downstream task objectives. To minimize TAoI, we design a hybrid LLM/SLM inference system and jointly optimize three interdependent components: adaptive image resolution selection, task-aware model routing (LLM vs. SLM), and wireless transmission scheduling. We formulate the problem as a Semi-Markov Decision Process (SMDP), convert it to an equivalent Markov decision process, and prove that the optimal policy has a threshold structure. A relative policy iteration algorithm that exploits this structure is developed for efficient solution. Simulation results demonstrate that our approach significantly outperforms baseline methods in the accuracy–timeliness trade-off: TAoI is reduced by 37%, while end-to-end latency and error rate are simultaneously optimized.

📝 Abstract
Large Language Models (LLMs) have revolutionized the field of artificial intelligence (AI) through their advanced reasoning capabilities, but their extensive parameter sets introduce significant inference latency, posing a challenge to ensure the timeliness of inference results. While Small Language Models (SLMs) offer faster inference speeds with fewer parameters, they often compromise accuracy on complex tasks. This study proposes a novel remote inference system comprising a user, a sensor, and an edge server that integrates both model types alongside a decision maker. The system dynamically determines the resolution of images transmitted by the sensor and routes inference tasks to either an SLM or LLM to optimize performance. The key objective is to minimize the Task-oriented Age of Information (TAoI) by jointly considering the accuracy and timeliness of the inference task. Due to the non-uniform transmission time and inference time, we formulate this problem as a Semi-Markov Decision Process (SMDP). By converting the SMDP to an equivalent Markov decision process, we prove that the optimal control policy follows a threshold-based structure. We further develop a relative policy iteration algorithm leveraging this threshold property. Simulation results demonstrate that our proposed optimal policy significantly outperforms baseline approaches in managing the accuracy-timeliness trade-off.
Problem

Research questions and friction points this paper is trying to address.

Minimize Task-oriented Age of Information (TAoI) for remote inference
Balance accuracy and timeliness in hybrid SLM/LLM inference systems
Optimize image resolution and model routing via an SMDP whose optimal policy is threshold-based
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid SLM and LLM for dynamic task routing
SMDP optimization for accuracy-timeliness trade-off
Threshold-based policy for optimal inference control
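
The threshold structure mentioned above can be illustrated on a much simpler problem than the paper's. The sketch below is not the authors' SMDP or their relative policy iteration algorithm: it runs plain relative value iteration on a hypothetical single-source age model, where each slot the controller either idles or pays `update_cost` for an update that succeeds with probability `p_success`. All names and parameter values are assumptions chosen for illustration.

```python
def relative_value_iteration(max_age=20, update_cost=3.0, p_success=0.8,
                             tol=1e-9, max_iter=10_000):
    """Average-cost relative value iteration on a toy age-update MDP.

    State: current age a in 1..max_age (age is capped at max_age).
    Action 0 (idle):   pay a; age grows by one.
    Action 1 (update): pay a + update_cost; age resets to 1 with
                       probability p_success, otherwise grows by one.
    Returns the estimated average cost and the greedy policy per age.
    """
    ages = range(1, max_age + 1)
    h = {a: 0.0 for a in ages}   # relative value function
    ref = 1                      # reference state used for normalization

    def q_values(a, values):
        nxt = min(a + 1, max_age)
        q_idle = a + values[nxt]
        q_update = (a + update_cost
                    + p_success * values[1]
                    + (1.0 - p_success) * values[nxt])
        return q_idle, q_update

    gain = 0.0
    for _ in range(max_iter):
        backed_up = {a: min(q_values(a, h)) for a in ages}
        gain = backed_up[ref]                        # average-cost estimate
        new_h = {a: backed_up[a] - gain for a in ages}
        diff = max(abs(new_h[a] - h[a]) for a in ages)
        h = new_h
        if diff < tol:
            break

    # Greedy policy: update only when it is strictly cheaper than idling.
    policy = [int(q_values(a, h)[1] < q_values(a, h)[0]) for a in ages]
    return gain, policy


if __name__ == "__main__":
    gain, policy = relative_value_iteration()
    print("average cost estimate:", round(gain, 4))
    print("policy by age (0 = idle, 1 = update):", policy)
```

In this toy model the printed policy switches from idle to update once and never switches back, which is the kind of threshold behavior the paper proves for its own, richer setting with resolution and SLM/LLM routing decisions.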
Authors
Shuying Gan
School of Electronics and Information Engineering, Sun Yat-sen University, Guangzhou, China
Xijun Wang
School of Electronics and Information Engineering, Sun Yat-sen University, Guangzhou, China
Chenyuan Feng
Department of Communication Systems, EURECOM, Sophia Antipolis, France
Chao Xu
School of Information Engineering, Northwest A&F University, Yangling, China
Howard H. Yang
Assistant Professor, ZJU-UIUC Institute, Zhejiang University
Wireless Networking, Stochastic Geometry, Communication Theory, Age of Information, Statistical Machine Learning
Xiang Chen
School of Electronics and Information Engineering, Sun Yat-sen University, Guangzhou, China
Tony Q. S. Quek
Information System and Technology Design Pillar, Singapore University of Technology and Design, Singapore