MARGIN: Margin-Aware Regularized Geometry for Imbalanced Vulnerability Detection

📅 2026-05-11

🤖 AI Summary
This work addresses the pervasive issue of frequency and difficulty imbalances in real-world vulnerability detection data, which distort the geometry of embedding spaces. It proposes a unified framework that models both types of imbalance within a hyperspherical embedding geometry, introducing a dynamic geometric regularization mechanism based on the concentration parameter of the von Mises–Fisher distribution. By integrating adaptive margin metric learning with hyperspherical prototype modeling, the method aligns the probability mass of the embedding distribution with its corresponding Voronoi cells, thereby mitigating representation distortion and stabilizing decision boundaries. Experimental results demonstrate that the approach significantly outperforms strong baselines across multiple public vulnerability datasets, particularly excelling under severe imbalance conditions, and yields embeddings with enhanced discriminability, interpretability, and generalization capability.
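To make the concentration parameter concrete, here is a minimal sketch of how a von Mises-Fisher concentration could be estimated from a set of embeddings. It uses the well-known closed-form approximation κ ≈ r̄(d − r̄²)/(1 − r̄²), where r̄ is the mean resultant length and d the embedding dimension. The function names, the toy clusters, and the choice of estimator are assumptions for illustration; the summary does not say which estimator MARGIN uses.

```python
import math
import random


def mean_resultant_length(vectors):
    """Length of the mean of the L2-normalized vectors, a value in [0, 1]."""
    dim = len(vectors[0])
    total = [0.0] * dim
    for v in vectors:
        norm = math.sqrt(sum(c * c for c in v))
        for i, c in enumerate(v):
            total[i] += c / norm  # project each vector onto the unit sphere
    mean = [t / len(vectors) for t in total]
    return math.sqrt(sum(m * m for m in mean))


def estimate_vmf_concentration(vectors):
    """Closed-form approximation of the vMF concentration (assumed estimator)."""
    dim = len(vectors[0])
    # Clamp below 1 so a perfectly aligned cluster does not divide by zero.
    r_bar = min(mean_resultant_length(vectors), 1.0 - 1e-8)
    return r_bar * (dim - r_bar ** 2) / (1.0 - r_bar ** 2)


def sample_cluster(center, noise, n, rng):
    """Hypothetical toy data: Gaussian noise added around a center vector."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


rng = random.Random(0)
center = [1.0, 0.0, 0.0, 0.0]
kappa_tight = estimate_vmf_concentration(sample_cluster(center, 0.05, 200, rng))
kappa_diffuse = estimate_vmf_concentration(sample_cluster(center, 1.0, 200, rng))
print(kappa_tight > kappa_diffuse)
```

A tightly clustered class yields a larger estimate than a diffuse one, which is the kind of signal a concentration-driven regularizer could react to.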
πŸ“ Abstract
Software vulnerability detection is critical for ensuring software security and reliability. Despite recent advances in deep learning, real-world vulnerability datasets suffer from two severe challenges: frequency imbalance and difficulty imbalance. We reinterpret these challenges from an embedding geometry perspective, observing that such imbalances induce geometric distortions in hyperspherical representation space. To address this issue, we propose MARGIN, a metric-based framework that learns discriminative vulnerability representations through adaptive margin metric learning and hyperspherical prototype modeling. MARGIN dynamically adjusts geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, aligning the probability mass of embedding distributions with their corresponding Voronoi cells, thereby reducing geometric distortion and yielding more stable decision boundaries. Extensive experiments on public vulnerability datasets show that MARGIN consistently outperforms strong baselines, achieving notable improvements in classification and detection, especially on challenging, imbalanced datasets. Further analysis demonstrates that MARGIN produces more structured embedding geometries, improving robustness, interpretability, and generalization.
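The idea of probability mass lying inside a Voronoi cell can be illustrated with a small diagnostic. On the hypersphere, the Voronoi cell of a class prototype is the set of points whose cosine similarity to that prototype is highest. The sketch below measures, per class, the fraction of embeddings that fall inside their own prototype's cell. The prototypes, embeddings, labels, and function names are hypothetical, and this is only a diagnostic, not the training objective described in the abstract.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def voronoi_cell(embedding, prototypes):
    """Return the label of the prototype with the highest cosine similarity."""
    return max(prototypes, key=lambda label: cosine(embedding, prototypes[label]))


def mass_in_own_cell(embeddings, labels, prototypes):
    """Per class, the fraction of embeddings inside their own prototype's cell."""
    hits, counts = {}, {}
    for emb, label in zip(embeddings, labels):
        counts[label] = counts.get(label, 0) + 1
        hits[label] = hits.get(label, 0) + (voronoi_cell(emb, prototypes) == label)
    return {label: hits[label] / counts[label] for label in counts}


# Hypothetical 2-D toy data for two classes.
prototypes = {"vulnerable": [1.0, 0.0], "benign": [0.0, 1.0]}
embeddings = [[0.9, 0.1], [0.4, 0.6], [0.1, 0.9], [0.2, 0.8]]
labels = ["vulnerable", "vulnerable", "benign", "benign"]
mass = mass_in_own_cell(embeddings, labels, prototypes)
print(mass)  # {'vulnerable': 0.5, 'benign': 1.0}
```

In this toy case, half of the "vulnerable" embeddings land in the "benign" cell, the sort of misalignment that better-structured embedding geometry is meant to reduce.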
Problem

Research questions and friction points this paper is trying to address.

frequency imbalance
difficulty imbalance
geometric distortion
vulnerability detection
imbalanced datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive margin metric learning
hyperspherical prototype modeling
geometric regularization
embedding geometry
imbalanced vulnerability detection
Yuteng Zhang
College of Artificial Intelligence and Computer Science, Northwest Normal University, Lanzhou 730070, China
Huifang Ma
College of Artificial Intelligence and Computer Science, Northwest Normal University, Lanzhou 730070, China
Jiahui Wei
College of Artificial Intelligence and Computer Science, Northwest Normal University, Lanzhou 730070, China
Qingqing Li
Researcher, University of Turku
Sensor Fusion, Robotics, Odometry, SLAM, Lidars
Yafei Yang
College of Artificial Intelligence and Computer Science, Northwest Normal University, Lanzhou 730070, China