How Small is Enough? Empirical Evidence of Quantized Small Language Models for Automated Program Repair

📅 2025-08-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the high computational cost and deployment challenges of large language models (LLMs) in automated program repair (APR), this work systematically evaluates the repair capability of small language models (SLMs) under resource-constrained settings. Using the QuixBugs benchmark, we empirically assess multiple state-of-the-art SLMs and apply INT8 quantization to reduce memory footprint. Results show that the best-performing SLM achieves repair accuracy comparable to mainstream LLMs (up to 68.2%) without quantization; after INT8 quantization, accuracy remains unchanged while GPU memory usage decreases by roughly 50% and inference latency drops significantly. This study is the first to demonstrate that carefully selected and lightweight-optimized SLMs can jointly achieve high efficiency and effectiveness in APR, establishing a viable new paradigm for code intelligence in edge and low-resource environments.
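The INT8 quantization the summary refers to can be illustrated with a generic sketch (this toy numpy example is not from the paper; the tensor shape and the symmetric per-tensor scheme are illustrative assumptions, and real inference kernels operate on actual model weights):

```python
import numpy as np

# Hypothetical weight tensor standing in for one SLM layer stored in fp16;
# the paper quantizes real model checkpoints, this is only an illustration.
rng = np.random.default_rng(0)
w_fp16 = rng.normal(0.0, 0.02, size=(1024, 1024)).astype(np.float16)

# Symmetric per-tensor INT8 quantization: one scale maps max|w| to 127.
scale = float(np.abs(w_fp16).astype(np.float32).max()) / 127.0
w_int8 = np.clip(np.round(w_fp16.astype(np.float32) / scale), -127, 127).astype(np.int8)

# Dequantize for use in matmuls (optimized kernels fuse this step).
w_deq = w_int8.astype(np.float32) * scale

# INT8 halves the memory of fp16 weights, matching the ~50% GPU memory
# reduction reported in the summary; rounding error stays within one step.
print(w_fp16.nbytes // w_int8.nbytes)  # 2
```

The rounding error of this scheme is bounded by half a quantization step (`scale / 2`), which is consistent with the finding that repair accuracy is essentially unaffected.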

๐Ÿ“ Abstract
Background: Large language models (LLMs) have greatly improved the accuracy of automated program repair (APR) methods. However, LLMs are constrained by high computational resource requirements. Aims: We focus on small language models (SLMs), which require far fewer computational resources than LLMs yet can still perform well. We aim to evaluate whether SLMs can achieve competitive performance in APR tasks. Method: We conducted experiments on the QuixBugs benchmark to compare the bug-fixing accuracy of SLMs and LLMs. We also analyzed the impact of INT8 quantization on APR performance. Results: The latest SLMs can fix bugs as accurately as, or even more accurately than, LLMs. Also, INT8 quantization had minimal effect on APR accuracy while significantly reducing memory requirements. Conclusions: SLMs present a viable alternative to LLMs for APR, offering competitive accuracy with lower computational costs, and quantization can further enhance their efficiency without compromising effectiveness.
Problem

Research questions and friction points this paper is trying to address.

Evaluating small language models for automated program repair
Comparing bug-fixing accuracy between SLMs and LLMs
Assessing impact of quantization on APR performance and efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Small language models for program repair
INT8 quantization reduces memory requirements
Competitive accuracy with lower computational costs