ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms

📅 2026-01-13
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses the limitations of traditional dynamic voltage and frequency scaling (DVFS) and task-to-core allocation methods, which rely on heuristics or offline profiling, struggle to generalize to unseen workloads, and neglect stall time, leading to suboptimal energy efficiency and thermal management. To overcome these challenges, the paper proposes the first large language model (LLM)-guided, zero-shot multi-agent reinforcement learning framework for runtime scheduling. The approach uses an LLM to extract 13-dimensional code-level semantic features from OpenMP programs and integrates hierarchical multi-agent action decomposition, regression-based environment modeling, and a Dyna-Q architecture to enable workload-agnostic scheduling without prior profiling. Experiments on Jetson TX2, Jetson Orin NX, RubikPi, and Intel Core i7 platforms demonstrate a 7.09× improvement in energy efficiency and a 4× reduction in task completion time compared with the Linux ondemand governor, with the first scheduling decision made 8,300× faster than conventional tabular methods.
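The Dyna-Q architecture the summary refers to combines direct Q-learning updates from real transitions with extra "planning" updates replayed from a learned environment model, which is what lets the table converge with far fewer real samples. A minimal sketch of that loop, using a toy thermal state and hypothetical (core count, frequency level) actions that are illustrative only, not the paper's actual state space or environment model:

```python
import random

# Toy Dyna-Q sketch: states are discrete thermal levels; actions are
# hypothetical (core_count, frequency_level) pairs. Reward and dynamics
# below are invented for illustration, not taken from the paper.
STATES = [0, 1, 2]                                # thermal level: cool .. hot
ACTIONS = [(c, f) for c in (1, 2, 4) for f in (0, 1, 2)]

def reward(t, action):
    # Prefer low thermal level, moderate frequency, and more active cores.
    c, f = action
    return -t - abs(f - 1) + 0.1 * c

def step(t, action):
    # Deterministic toy dynamics: top frequency heats the chip, else it cools.
    _, f = action
    return min(2, max(0, t + (1 if f == 2 else -1)))

def dyna_q(episodes=100, planning_steps=10, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    model = {}                        # (s, a) -> (r, s') seen in real experience
    for _ in range(episodes):
        s = rng.choice(STATES)
        for _ in range(20):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            r, s2 = reward(s, a), step(s, a)
            # Direct RL update from the real transition.
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
            model[(s, a)] = (r, s2)
            # Planning: replay simulated transitions from the learned model.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
            s = s2
    return Q
```

Each real transition here funds ten simulated updates, which is the mechanism behind the paper's reported convergence speedup over purely model-free learning; in ZeroDVFS the model is regression-based rather than a replay table.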

📝 Abstract
Dynamic voltage and frequency scaling (DVFS) and task-to-core allocation are critical for thermal management and for balancing energy and performance in embedded systems. Existing approaches either rely on utilization-based heuristics that overlook stall times, or require extensive offline profiling for table generation, preventing runtime adaptation. We propose a model-based hierarchical multi-agent reinforcement learning (MARL) framework for thermal- and energy-aware scheduling on multi-core platforms. Two collaborative agents decompose the exponential action space, achieving 358 ms latency for subsequent decisions; first decisions require 3.5 to 8.0 s, including one-time LLM feature extraction. An accurate environment model leverages regression techniques to predict thermal dynamics and performance states. When combined with LLM-extracted semantic features, the environment model enables zero-shot deployment for new workloads on trained platforms by generating synthetic training data without requiring workload-specific profiling samples. We introduce LLM-based semantic feature extraction that characterizes OpenMP programs through 13 code-level features without execution. The Dyna-Q-inspired framework integrates direct reinforcement learning with model-based planning, achieving 20× faster convergence than model-free methods. Experiments on the BOTS and PolybenchC benchmarks across NVIDIA Jetson TX2, Jetson Orin NX, RubikPi, and Intel Core i7 platforms demonstrate 7.09× better energy efficiency and 4.0× better makespan than the Linux ondemand governor. First-decision latency is 8,300× faster than table-based profiling, enabling practical deployment in dynamic embedded systems.
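The abstract's "two collaborative agents decompose the exponential action space" can be illustrated as hierarchical action selection: one agent chooses the core count and a second chooses the frequency conditioned on that choice, so the per-decision action spaces add rather than multiply. A hypothetical sketch (the core counts, frequency levels, state encoding, and table layout are assumptions for illustration, not the paper's implementation):

```python
import random

CORES = [1, 2, 4, 8]               # toy core-count choices
FREQS = [652, 1113, 1574, 2035]    # toy frequency levels (MHz-style values)

# Joint action space grows multiplicatively; decomposition makes it additive:
# one 16-way choice becomes a 4-way choice followed by another 4-way choice.
q_core = {(s, c): 0.0 for s in range(3) for c in CORES}
q_freq = {(s, c, f): 0.0 for s in range(3) for c in CORES for f in FREQS}

def select(state, rng, eps=0.1):
    """Agent 1 picks the core count; agent 2 picks the frequency given it."""
    if rng.random() < eps:
        c = rng.choice(CORES)
    else:
        c = max(CORES, key=lambda c: q_core[(state, c)])
    if rng.random() < eps:
        f = rng.choice(FREQS)
    else:
        f = max(FREQS, key=lambda f: q_freq[(state, c, f)])
    return c, f
```

Conditioning the frequency agent on the chosen core count keeps the two choices coordinated while each agent still faces only a small discrete action set, which is what makes millisecond-scale decision latency plausible on embedded hardware.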
Problem

Research questions and friction points this paper is trying to address.

DVFS
task-to-core allocation
zero-shot deployment
embedded systems
thermal management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot learning
LLM-guided scheduling
Multi-agent reinforcement learning
Dynamic voltage and frequency scaling (DVFS)
Thermal-aware embedded systems