🤖 AI Summary
Existing vulnerability detection methods rely predominantly on a single code modality and overlook the developer intent expressed in code comments, which limits their generalization in complex logical scenarios. To address this, the authors propose MultiVul, a framework that for the first time jointly models source code and natural language comments as complementary modalities. MultiVul improves representation robustness through multimodal contrastive learning, dual-similarity alignment, and consistency regularization, and is used to fine-tune multiple large language models, including DeepSeek-Coder and Qwen2.5-Coder. On the DiverseVul and Devign benchmarks, MultiVul achieves an F1-score improvement of up to 27.07% over prompt engineering baselines and up to 13.37% over code-only fine-tuning, while maintaining comparable inference efficiency. These results indicate that combining code and comments outperforms the unimodal approaches evaluated.
📝 Abstract
Source code and its accompanying comments are complementary yet naturally aligned modalities: code encodes structural logic, while comments capture developer intent. However, existing vulnerability detection methods mostly rely on single-modality code representations, overlooking the complementary semantic information embedded in comments and thus limiting their generalization across complex code structures and logical relationships. To address this, we propose MultiVul, a multimodal contrastive framework that aligns code and comment representations through dual similarity learning and consistency regularization, augmented with diverse code-text pairs to improve robustness. Experiments on the widely adopted DiverseVul and Devign datasets across four large language models (LLMs), namely DeepSeek-Coder-6.7B, Qwen2.5-Coder-7B, StarCoder2-7B, and CodeLlama-7B, show that MultiVul achieves an F1 improvement of up to 27.07% over prompting-based methods and up to 13.37% over code-only fine-tuning, while maintaining comparable inference efficiency.
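
To make the idea of aligning code and comment representations concrete, the following is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss over paired embeddings. It is not MultiVul's actual objective: the abstract does not specify the exact form of the dual similarity or consistency terms, so the two-direction formulation, the function names, and the temperature value of 0.07 are all illustrative assumptions. Pure Python is used so the example runs without dependencies.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_matrix(code_embs, comment_embs):
    """Return sim[i][j] = cosine similarity of code i and comment j."""
    code = [normalize(v) for v in code_embs]
    text = [normalize(v) for v in comment_embs]
    return [[sum(a * b for a, b in zip(c, t)) for t in text] for c in code]


def diagonal_cross_entropy(sim, temperature):
    """Mean cross-entropy where row i's correct match is column i."""
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(sim)


def symmetric_contrastive_loss(code_embs, comment_embs, temperature=0.07):
    """Average the code-to-comment and comment-to-code losses.

    Hypothetical stand-in for an alignment objective; the temperature
    default is an assumption, not a value taken from the paper.
    """
    sim = cosine_matrix(code_embs, comment_embs)
    sim_transposed = [list(col) for col in zip(*sim)]
    code_to_comment = diagonal_cross_entropy(sim, temperature)
    comment_to_code = diagonal_cross_entropy(sim_transposed, temperature)
    return 0.5 * (code_to_comment + comment_to_code)


# Toy 2-D embeddings: each code snippet paired with its own comment.
code_vectors = [[1.0, 0.0], [0.0, 1.0]]
matched_comments = [[1.0, 0.0], [0.0, 1.0]]
swapped_comments = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = symmetric_contrastive_loss(code_vectors, matched_comments)
loss_swapped = symmetric_contrastive_loss(code_vectors, swapped_comments)
print(loss_matched < loss_swapped)
```

In this toy setup the loss is lower when each code vector sits next to its own comment vector than when the pairs are swapped, which is the behavior a contrastive alignment objective is meant to reward. In practice the embeddings would come from the fine-tuned LLM rather than hand-written vectors.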