🤖 AI Summary
To address the performance bottlenecks of private small language models (SLMs) in resource-constrained settings, this paper proposes a process-reward-driven dynamic collaborative inference framework. The method avoids fine-tuning general-purpose large language models (LLMs) and never exposes private data, instead enabling lightweight, controllable joint inference between SLMs and LLMs via four key components: (i) modeling of step-wise process rewards, (ii) adaptive collaborative decoding, (iii) API-aware scheduling, and (iv) heterogeneous architecture design. Evaluated across multiple benchmark tasks, the private SLM achieves substantial performance gains—matching or even surpassing the standalone performance of general-purpose LLMs—while reducing inference cost by over 30%. The core contribution is the first introduction of a process reward mechanism into SLM–LLM collaborative inference, balancing efficiency, privacy preservation, and inference controllability.
📝 Abstract
Due to limited computational resources, most developers cannot fine-tune Large Language Models (LLMs) and can only fine-tune Small Language Models (SLMs) on their own data. These private SLMs typically have limited effectiveness. To boost the performance of private SLMs, this paper proposes to ask general LLMs for help. The general LLMs can be APIs or larger LLMs whose inference cost the developers can afford. Specifically, we propose the G-Boost framework, in which a private SLM adaptively performs collaborative inference with a general LLM under the guidance of a process reward. Experiments demonstrate that our framework can significantly boost the performance of private SLMs.
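To make the collaboration described above concrete, the following is a minimal sketch of process-reward-guided collaborative decoding. Everything here is an assumption for illustration, not the paper's actual implementation: `slm_step`, `llm_step`, and `process_reward` are hypothetical stand-ins for the private SLM, the general LLM (e.g. an API), and a trained process reward model (PRM), and `llm_budget` is a hypothetical knob standing in for cost-aware scheduling of expensive LLM calls.

```python
def slm_step(context):
    """Stand-in for the private SLM proposing the next reasoning step."""
    return context + " [slm-step]"

def llm_step(context):
    """Stand-in for the general LLM proposing the next reasoning step."""
    return context + " [llm-step]"

def process_reward(candidate):
    """Stand-in PRM scoring a partial solution step-wise.
    Here it trivially prefers LLM-produced steps, purely for illustration."""
    return candidate.count("[llm-step]")

def collaborative_decode(question, max_steps=3, llm_budget=1):
    """At each reasoning step, take the SLM's proposal by default, but switch
    to the (costlier) general LLM whenever the PRM scores its proposal higher
    and the LLM-call budget is not yet exhausted."""
    context, llm_calls = question, 0
    for _ in range(max_steps):
        slm_cand = slm_step(context)
        if llm_calls < llm_budget:
            llm_cand = llm_step(context)
            if process_reward(llm_cand) > process_reward(slm_cand):
                context, llm_calls = llm_cand, llm_calls + 1
                continue
        context = slm_cand
    return context, llm_calls
```

In this toy run, only one of the three steps is delegated to the LLM, which is the efficiency argument in miniature: the PRM decides step by step when the general model is actually worth its inference cost.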