🤖 AI Summary
Problem: In smart contract vulnerability detection, large language models (LLMs) generalize poorly and perform inconsistently across diverse vulnerability types and contract structures.
Method: This paper proposes LLMBugScanner, a detection framework that integrates domain-knowledge adaptation with consensus-driven ensemble inference. It jointly optimizes multiple LLMs on heterogeneous vulnerability datasets via parameter-efficient fine-tuning (PEFT) and complementary multi-model training, then resolves conflicting model outputs through instruction-guided vulnerability reasoning and consensus-based ensemble decision-making to improve detection consistency.
Contribution/Results: Evaluated on mainstream LLMs, including Llama-3, Qwen, and DeepSeek, LLMBugScanner significantly improves accuracy and cross-vulnerability generalization, achieving an average F1-score gain of 12.7% over both single-model fine-tuning and existing state-of-the-art methods.
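The cost argument behind the PEFT step can be illustrated with a back-of-the-envelope parameter count. This is a sketch only: the summary names PEFT but does not specify the method, so LoRA (one common PEFT technique, which freezes a weight matrix W and trains a low-rank update B @ A) is an assumption here.

```python
# Hypothetical illustration of why PEFT cuts fine-tuning cost.
# LoRA is assumed as the PEFT method; the paper does not specify one.
# For a d_out x d_in linear layer, LoRA trains B (d_out x r) and
# A (r x d_in) with rank r << min(d_in, d_out), keeping W frozen.

def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when fine-tuning the full weight matrix."""
    return d_out * d_in

def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter on the same layer."""
    return d_out * r + r * d_in

# Example: a 4096x4096 projection (typical of 7B-class models), rank 8.
full = full_finetune_params(4096, 4096)      # 16,777,216 parameters
lora = lora_trainable_params(4096, 4096, 8)  # 65,536 parameters (~0.4%)
```

At rank 8 the adapter trains well under 1% of the layer's parameters, which is why the framework can afford to fine-tune several LLMs instead of one.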
📝 Abstract
This paper presents LLMBugScanner, a large language model (LLM)-based framework for smart contract vulnerability detection that combines fine-tuning and ensemble learning. Smart contract auditing poses several challenges for LLMs: different pretrained models exhibit varying reasoning abilities, and no single model performs consistently well across all vulnerability types or contract structures. These limitations persist even after fine-tuning individual LLMs. To address them, LLMBugScanner combines domain-knowledge adaptation with ensemble reasoning to improve robustness and generalization. For domain-knowledge adaptation, we fine-tune LLMs on complementary datasets to capture both general code semantics and instruction-guided vulnerability reasoning, using parameter-efficient tuning to reduce computational cost. For ensemble reasoning, we leverage the complementary strengths of multiple LLMs and apply a consensus-based conflict-resolution strategy to produce more reliable vulnerability assessments. We conduct extensive experiments across multiple popular LLMs and compare LLMBugScanner with both pretrained and fine-tuned individual models. The results show that LLMBugScanner achieves consistent accuracy improvements and stronger generalization, demonstrating a principled, cost-effective, and extensible framework for smart contract auditing.
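The consensus-based conflict resolution described above can be sketched as majority voting over per-model verdicts. This is a minimal illustration under assumed interfaces (the label names, the set-of-labels output format, and the majority quorum are all hypothetical; the paper's actual strategy may differ).

```python
# Sketch of consensus-based ensemble inference: each fine-tuned LLM
# emits a set of suspected vulnerability labels for a contract, and the
# ensemble reports only labels supported by a majority of models.
# Interface and label names are hypothetical, not from the paper.
from collections import Counter

def consensus_verdict(model_outputs: list[set[str]], quorum: float = 0.5) -> set[str]:
    """Return vulnerability labels flagged by more than `quorum` of the models."""
    votes = Counter(label for labels in model_outputs for label in labels)
    threshold = quorum * len(model_outputs)
    return {label for label, count in votes.items() if count > threshold}

# Example: three fine-tuned models audit the same contract and disagree.
outputs = [
    {"reentrancy", "integer-overflow"},     # e.g. Llama-3 verdict
    {"reentrancy"},                         # e.g. Qwen verdict
    {"reentrancy", "timestamp-dependence"}, # e.g. DeepSeek verdict
]
# Only "reentrancy" clears the majority threshold (3 of 3 votes).
```

A simple majority vote like this suppresses labels that only one model hallucinates while keeping findings the models agree on, which is the intuition behind the reliability gains the paper reports.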