GPIoT: Tailoring Small Language Models for IoT Program Synthesis and Development

📅 2025-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address core challenges in IoT-oriented LLM-based code generation, including privacy leakage, network dependency, high latency, costly API queries, and insufficient output accuracy, this paper proposes GPIoT, a code generation system built on locally deployable Small Language Models (SLMs). Methodologically, the authors (1) construct IoT-specialized datasets and fine-tune SLMs on them to substantially improve IoT program synthesis; (2) deploy the fine-tuned SLMs locally, avoiding the powerful cloud LLMs (e.g., GPT-4) that RAG-based pipelines require; and (3) develop IoTBench, a benchmark for evaluating program synthesis on IoT tasks. Experiments and user trials show an average task accuracy improvement of 64.7% over state-of-the-art code LLMs, along with stronger local privacy preservation, robustness to unstable networks, and higher user satisfaction.

📝 Abstract
Code Large Language Models (LLMs) enhance software development efficiency by automatically generating code and documentation in response to user requirements. However, code LLMs cannot synthesize specialized programs when tasked with IoT applications that require domain knowledge. While Retrieval-Augmented Generation (RAG) offers a promising solution by fetching relevant domain knowledge, it necessitates powerful cloud LLMs (e.g., GPT-4) to process user requirements and retrieved contents, which raises significant privacy concerns. This approach also suffers from unstable networks and prohibitive LLM query costs. Moreover, it is challenging to ensure the correctness and relevance of the fetched contents. To address these issues, we propose GPIoT, a code generation system for IoT applications by fine-tuning locally deployable Small Language Models (SLMs) on IoT-specialized datasets. SLMs have smaller model sizes, allowing efficient local deployment and execution to mitigate privacy concerns and network uncertainty. Furthermore, by fine-tuning the SLMs with our IoT-specialized datasets, the SLMs' ability to synthesize IoT-related programs can be substantially improved. To evaluate GPIoT's capability in synthesizing programs for IoT applications, we develop a benchmark, IoTBench. Extensive experiments and user trials demonstrate the effectiveness of GPIoT in generating IoT-specialized code, outperforming state-of-the-art code LLMs with an average task accuracy increment of 64.7% and significant improvements in user satisfaction.
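The abstract contrasts GPIoT with RAG pipelines, where relevant domain knowledge is first retrieved and then handed to a cloud LLM. As a concrete illustration of that retrieval step, the sketch below scores corpus documents against a user requirement with plain lexical (Jaccard) overlap; the corpus, scoring function, and function names are illustrative assumptions, not from the paper, and a real system would use an embedding-based retriever instead:

```python
def retrieve(query, corpus, k=2):
    """Return the top-k corpus documents most similar to the query.

    Similarity here is Jaccard overlap of lowercased tokens -- a
    deliberately simple stand-in for embedding-based retrieval.
    """
    q = set(query.lower().split())

    def score(doc):
        d = set(doc.lower().split())
        return len(q & d) / len(q | d)

    return sorted(corpus, key=score, reverse=True)[:k]


# Hypothetical IoT knowledge snippets (not from the paper).
corpus = [
    "FFT-based preprocessing for accelerometer signals",
    "Kalman filtering for sensor fusion on embedded devices",
    "Recipe for sourdough bread",
]

top = retrieve("filter noisy accelerometer sensor data", corpus, k=2)
print(top)
```

In a RAG pipeline, `top` would be concatenated with the user requirement and sent to a cloud LLM, which is exactly the step that raises the privacy, network, and cost concerns the abstract describes; GPIoT sidesteps it by baking the domain knowledge into a locally run SLM.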
Problem

Research questions and friction points this paper is trying to address.

Addressing IoT program synthesis challenges with domain-specific knowledge
Mitigating privacy and network issues in IoT code generation
Enhancing IoT code accuracy and relevance using fine-tuned SLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes Small Language Models for IoT
Uses IoT-specialized datasets for training
Develops IoTBench for performance evaluation
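The paper fine-tunes SLMs on IoT-specialized datasets; one common way to make such fine-tuning tractable on modest hardware is a parameter-efficient, LoRA-style low-rank update. The plain-Python sketch below shows the core idea, freezing the base weight matrix W and training only two small factors A and B; the dimensions, rank, and scaling here are a generic illustration, not the paper's stated recipe:

```python
def lora_update(W, A, B, alpha=1.0):
    """Apply a low-rank adaptation: W' = W + alpha * (B @ A).

    W: d_out x d_in frozen base weights (list of lists)
    A: r x d_in and B: d_out x r are the trainable low-rank factors.
    """
    d_in = len(W[0])
    r = len(A)
    return [
        [W[i][j] + alpha * sum(B[i][k] * A[k][j] for k in range(r))
         for j in range(d_in)]
        for i in range(len(W))
    ]


# Toy sizes: a 64x64 layer adapted with rank-4 factors.
d_out, d_in, r = 64, 64, 4
W = [[0.0] * d_in for _ in range(d_out)]
A = [[0.1] * d_in for _ in range(r)]
B = [[0.1] * r for _ in range(d_out)]

W_adapted = lora_update(W, A, B)

full_params = d_out * d_in        # trainable params if fine-tuning W directly
lora_params = r * (d_in + d_out)  # trainable params with rank-r factors only
print(full_params, lora_params)
```

Even in this toy setting the adapter trains 512 parameters instead of 4096, an 8x reduction; at SLM scale the same structure is what makes local, on-device fine-tuning and deployment practical.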