Learning the Language of NVMe Streams for Ransomware Detection

📅 2025-02-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two challenges in NVMe-based ransomware detection: (1) identifying ransomware behavior within NVMe command streams in real time, and (2) estimating how much data the ransomware has accessed. It applies language modeling to NVMe command sequences and proposes two Transformer models that work at different granularities: the Command-Level Transformer (CLT) and the Patch-Level Transformer (PLT). CLT classifies individual commands as ransomware-initiated or benign using custom NVMe tokenization, in-context token classification, and learned command embeddings; PLT predicts the volume of data accessed by ransomware within a patch of commands. Evaluated on real-world NVMe traces, the approach improves over state-of-the-art tabular methods by up to 24% in missed-detection rate, up to 66% in data loss prevention, and up to 84% in identifying data accessed by ransomware. The work frames ransomware detection at the storage protocol layer as a language-modeling problem and targets real-time, hardware-aware defense.

📝 Abstract

We apply language modeling techniques to detect ransomware activity in NVMe command sequences. We design and train two types of transformer-based models: the Command-Level Transformer (CLT) performs in-context token classification to determine whether individual commands are initiated by ransomware, and the Patch-Level Transformer (PLT) predicts the volume of data accessed by ransomware within a patch of commands. We present both model designs and the corresponding tokenization and embedding schemes and show that they improve over state-of-the-art tabular methods by up to 24% in missed-detection rate, 66% in data loss prevention, and 84% in identifying data accessed by ransomware.
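To make the idea of tokenizing NVMe commands concrete, here is a minimal sketch of turning a command stream into per-command tokens. The abstract does not spell out the tokenization scheme at this level of detail, so everything here is an assumption for illustration: the `NvmeCommand` fields, the log2 size buckets, and the `FIRST`/`SEQ`/`JUMP` locality tokens are hypothetical and not the authors' vocabulary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NvmeCommand:
    # Hypothetical, simplified view of an NVMe I/O command.
    opcode: str      # "read" or "write"
    lba: int         # starting logical block address
    num_blocks: int  # number of logical blocks transferred


def tokenize(commands: List[NvmeCommand]) -> List[List[str]]:
    """Map each command to a small group of discrete tokens (illustrative only)."""
    tokens: List[List[str]] = []
    prev_end = None  # LBA just past the previous command, used for locality
    for cmd in commands:
        # Bucket the transfer size by floor(log2(num_blocks)); guard against 0.
        size_bucket = max(cmd.num_blocks, 1).bit_length() - 1
        if prev_end is None:
            locality = "FIRST"
        elif cmd.lba == prev_end:
            locality = "SEQ"   # continues exactly where the last command ended
        else:
            locality = "JUMP"  # non-contiguous access
        tokens.append([
            f"OP_{cmd.opcode.upper()}",
            f"SIZE_2^{size_bucket}",
            f"LOC_{locality}",
        ])
        prev_end = cmd.lba + cmd.num_blocks
    return tokens


stream = [
    NvmeCommand("read", 0, 8),
    NvmeCommand("write", 8, 8),
    NvmeCommand("write", 100, 1),
]
token_groups = tokenize(stream)
```

In a token-classification setup, each token group would be embedded and the model would emit one ransomware/benign label per command, conditioned on the surrounding commands in the sequence.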

Problem

Research questions and friction points this paper is trying to address.

Detect ransomware in NVMe command sequences

Improve ransomware detection using transformer models

Enhance data loss prevention and access identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based ransomware detection models

Token classification for individual commands

Patch-level prediction of data access
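The patch-level idea can be sketched as follows: group consecutive commands into fixed-size patches and compute, for each patch, the volume of data touched by ransomware-initiated commands. This is only an illustration of what a regression target for such a model might look like. The 4096-byte block size, the non-overlapping patches, and the `patch_targets` helper are assumptions, not details taken from the paper.

```python
from typing import List, Tuple

BLOCK_BYTES = 4096  # assumed logical block size; real devices may differ


def patch_targets(commands: List[Tuple[int, bool]], patch_size: int) -> List[int]:
    """Return bytes accessed by ransomware in each patch of commands.

    Each command is (num_blocks, is_ransomware). Patches are consecutive,
    non-overlapping groups of `patch_size` commands; the last may be shorter.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    targets: List[int] = []
    for start in range(0, len(commands), patch_size):
        patch = commands[start:start + patch_size]
        # Sum only the bytes moved by commands labeled as ransomware.
        targets.append(sum(n * BLOCK_BYTES for n, is_rw in patch if is_rw))
    return targets


labeled = [(8, False), (8, True), (1, True), (4, False), (2, True)]
volumes = patch_targets(labeled, patch_size=2)
```

A patch-level model would be trained to predict these per-patch volumes from the patch's commands, rather than a label for each command.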

🔎 Similar Papers

Minerva: A File-Based Ransomware Detector