AI Summary
NVD vulnerability descriptions are often terse, outdated, and short on detail, which hinders threat intelligence sharing. To address this, we propose Zad, the first dual-pipeline system that combines a bi-encoder-driven multi-source retrieval module with domain-adapted generative models (BART/LLaMA) to enhance and dynamically update vulnerability descriptions automatically. The approach balances informational breadth with semantic coherence and fuses vulnerability data across sources. Evaluation combines ROUGE and BERTScore with human assessment. In experiments, enhanced descriptions increase in length by 217%, while ROUGE-L and BERTScore improve by 18.3% and 22.6%, respectively; human evaluation yields an 89.4% acceptance rate. Zad significantly improves description completeness, timeliness, and operational utility for downstream security applications.
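Since ROUGE-L is one of the reported metrics, a minimal sketch of how it can be computed may help readers interpret the numbers. The version below is an illustration only: it assumes whitespace tokenization and the balanced F1 form of the score, and the function names `lcs_length` and `rouge_l_f1` are hypothetical. It is not the evaluation code used for Zad.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as a balanced F1 over LCS-based precision and recall.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Because the score rewards long in-order token overlap with a reference, it captures shared content but not meaning, which is presumably why it is paired with BERTScore and human assessment.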
Abstract
Public vulnerability databases, such as the National Vulnerability Database (NVD), document vulnerabilities and facilitate threat information sharing. However, they often suffer from short descriptions and outdated or insufficient information. In this paper, we introduce Zad, a system designed to enrich NVD vulnerability descriptions by leveraging external resources. Zad consists of two pipelines: one collects and filters supplementary data using two encoders to build a detailed dataset, while the other fine-tunes a pre-trained model on this dataset to generate enriched descriptions. By addressing brevity and improving content quality, Zad produces more comprehensive and cohesive vulnerability descriptions. We evaluate Zad using standard summarization metrics and human assessments, demonstrating its effectiveness in enhancing vulnerability information.
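The filtering step of the first pipeline can be illustrated with a small sketch: embed the NVD description and each candidate passage, then keep only the passages whose similarity to the description clears a threshold. Everything here is an assumption made for illustration. The bag-of-words `embed` function stands in for a learned encoder, the `threshold` value is arbitrary, and the function names are hypothetical; the abstract does not specify how Zad's encoders or filtering rule are implemented.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_passages(description, passages, threshold=0.3):
    """Return passages similar enough to the description, best match first.

    Assumption: a fixed similarity threshold decides what is kept.
    """
    query = embed(description)
    scored = [(cosine(query, embed(p)), p) for p in passages]
    kept = [(s, p) for s, p in scored if s >= threshold]
    kept.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in kept]
```

For example, given the description "buffer overflow in image parser", a passage about a heap buffer overflow in the image parser would be kept, while an unrelated passage sharing no words with the description would score 0 and be dropped. In a real system the retained passages would then form the training data for the generation pipeline.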