MARGIN: Margin-Aware Regularized Geometry for Imbalanced Vulnerability Detection

📅 2026-05-11

🤖 AI Summary

This work addresses two forms of imbalance that are pervasive in real-world vulnerability detection data, frequency imbalance and difficulty imbalance, both of which distort the geometry of the embedding space. It proposes MARGIN, a unified framework that models both within a hyperspherical embedding geometry and introduces a dynamic geometric regularization mechanism driven by the concentration parameter of the von Mises–Fisher distribution. By combining adaptive margin metric learning with hyperspherical prototype modeling, the method aligns the probability mass of the embedding distributions with their corresponding Voronoi cells, which reduces representation distortion and stabilizes decision boundaries. The authors report that the approach consistently outperforms strong baselines on public vulnerability datasets, especially on challenging, imbalanced ones, and that it produces more structured embedding geometries with improved robustness, interpretability, and generalization.
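The concentration parameter mentioned above measures how tightly a set of unit-norm embeddings clusters around its mean direction. The summary does not say how MARGIN estimates it, so the sketch below is only an illustration: it uses the widely cited closed-form approximation kappa ≈ R(d - R²)/(1 - R²), where R is the length of the mean of the normalized embeddings and d is the embedding dimension. The function names `l2_normalize` and `vmf_concentration` are hypothetical and not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project rows of x onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def vmf_concentration(embeddings):
    """Approximate the von Mises-Fisher concentration of one class.

    Assumption: uses the standard closed-form approximation based on the
    mean resultant length; the paper may estimate kappa differently.
    """
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    _, d = z.shape
    # Mean resultant length: close to 1 for tight clusters, close to 0 for diffuse ones.
    r_bar = np.linalg.norm(z.mean(axis=0))
    # Clip so the denominator stays positive for (near-)identical embeddings.
    r_bar = min(r_bar, 1.0 - 1e-6)
    return r_bar * (d - r_bar**2) / (1.0 - r_bar**2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    direction = np.ones(16)
    tight = direction + 0.05 * rng.normal(size=(200, 16))
    loose = direction + 1.0 * rng.normal(size=(200, 16))
    print("tight cluster kappa:", vmf_concentration(tight))
    print("loose cluster kappa:", vmf_concentration(loose))
```

A tightly clustered class yields a larger estimate than a diffuse one, which is the kind of per-class signal a concentration-driven regularizer could react to.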

📝 Abstract

Software vulnerability detection is critical for ensuring software security and reliability. Despite recent advances in deep learning, real-world vulnerability datasets suffer from two severe challenges: frequency imbalance and difficulty imbalance. We reinterpret these challenges from an embedding geometry perspective, observing that such imbalances induce geometric distortions in hyperspherical representation space. To address this issue, we propose MARGIN, a metric-based framework that learns discriminative vulnerability representations through adaptive margin metric learning and hyperspherical prototype modeling. MARGIN dynamically adjusts geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, aligning the probability mass of embedding distributions with their corresponding Voronoi cells, thereby reducing geometric distortion and yielding more stable decision boundaries. Extensive experiments on public vulnerability datasets show that MARGIN consistently outperforms strong baselines, achieving notable improvements in classification and detection, especially on challenging, imbalanced datasets. Further analysis demonstrates that MARGIN produces more structured embedding geometries, improving robustness, interpretability, and generalization.
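To make the prototype and margin ideas in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes one prototype per class on the unit hypersphere, treats a class's Voronoi cell as the set of embeddings whose cosine similarity is highest for that class's prototype, and applies an additive cosine margin to the target class before a softmax cross-entropy. The names `voronoi_assign` and `margin_cosine_loss`, the `scale` value, and the additive form of the margin are all assumptions; the abstract does not specify how MARGIN derives its per-class margins, so they are passed in by the caller.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project rows of x onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def voronoi_assign(embeddings, prototypes):
    """Return, for each embedding, the index of the most similar prototype.

    On the unit sphere this is the spherical Voronoi cell the embedding falls in.
    """
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    p = l2_normalize(np.asarray(prototypes, dtype=float))
    return np.argmax(z @ p.T, axis=1)


def margin_cosine_loss(embeddings, labels, prototypes, margins, scale=16.0):
    """Softmax cross-entropy over cosine logits with a per-class additive margin.

    Assumption: margins[c] is supplied externally (for example, derived from
    class frequency or concentration); this sketch does not compute it.
    """
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    p = l2_normalize(np.asarray(prototypes, dtype=float))
    labels = np.asarray(labels)
    margins = np.asarray(margins, dtype=float)

    logits = z @ p.T  # cosine similarity to every prototype, shape (n, k)
    rows = np.arange(len(labels))
    # Penalize the true class so it must win by at least its margin.
    logits[rows, labels] -= margins[labels]
    logits *= scale
    # Numerically stable log-softmax.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[rows, labels].mean())


if __name__ == "__main__":
    prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
    embeddings = np.array([[0.9, 0.1], [0.2, 0.8]])
    labels = np.array([0, 1])
    print(voronoi_assign(embeddings, prototypes))
    print(margin_cosine_loss(embeddings, labels, prototypes, margins=[0.3, 0.3]))
```

With a nonzero margin, an embedding that is only barely inside the correct cell still incurs a noticeable loss, which pushes embeddings toward the interior of their cell rather than its boundary.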

Problem

Research questions and friction points this paper is trying to address.

frequency imbalance

difficulty imbalance

geometric distortion

vulnerability detection

imbalanced datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive margin metric learning

hyperspherical prototype modeling

geometric regularization

embedding geometry

imbalanced vulnerability detection
