TrojanLoC: LLM-based Framework for RTL Trojan Localization

📅 2025-11-29
🤖 AI Summary
To address semantic loss, limited graph neural network (GNN) receptive fields, and coarse-grained localization in RTL-level hardware Trojan (HT) detection, this paper proposes the first fine-grained HT detection framework based on RTL-finetuned large language models (LLMs). The method directly extracts module- and line-level semantic features from RTL source code and integrates them with dataflow graphs to preserve both global context and local structural information. The paper also introduces TrojanInS, a large-scale, synthetically generated dataset with fine-grained annotations, enabling multi-class, effect-oriented HT detection. Experiments demonstrate state-of-the-art performance: 0.99 F1-score for module-level detection (up to +0.68 over baselines), 0.84 macro-F1 for HT type classification, and 0.93 macro-F1 for line-level localization, which makes precise HT identification substantially more practical. This work pioneers the adaptation of LLMs to RTL for hardware security analysis, establishing a new paradigm for semantic-aware, multi-granularity, and high-accuracy HT detection.

📝 Abstract
Hardware Trojans (HTs) are a persistent threat to integrated circuits, especially when inserted at the register-transfer level (RTL). Existing methods typically first convert the design into a graph, such as a gate-level netlist or an RTL-derived dataflow graph (DFG), and then use a graph neural network (GNN) to obtain an embedding of that graph. This approach (i) loses compact RTL semantics, (ii) relies on shallow GNNs with a limited receptive field, and (iii) is largely restricted to coarse, module-level binary HT detection. We propose TrojanLoC, an LLM-based framework for RTL-level HT localization. We use an RTL-finetuned LLM to derive module-level and line-level embeddings directly from RTL code, capturing both global design context and local semantics. Next, we train task-specific classifiers on these embeddings to perform module-level Trojan detection, type prediction, and fine-grained line-level localization. We also introduce TrojanInS, a large synthetic dataset of RTL designs with systematically injected Trojans from four effect-based categories, each accompanied by precise line-level annotations. Our experiments show that TrojanLoC achieves strong module-level performance, reaching 0.99 F1-score for Trojan detection, up to 0.68 higher than the baselines, and 0.84 macro-F1 for Trojan-type classification. At the line level, TrojanLoC further achieves up to 0.93 macro-F1, enabling fine-grained localization of Trojan-relevant RTL lines.
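
The type-classification and line-level results above are reported as macro-F1, which averages the per-class F1 scores with equal weight so that rare classes count as much as frequent ones. The sketch below shows that computation in plain Python. The label names are hypothetical placeholders and are not the dataset's actual class names.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical line labels: "clean" vs. "trojan" (placeholder names).
y_true = ["clean", "clean", "clean", "trojan"]
y_pred = ["clean", "clean", "trojan", "trojan"]
score = macro_f1(y_true, y_pred)
```

In this toy case the "clean" class has F1 = 0.8 and the "trojan" class has F1 = 2/3, so the macro average sits between the two regardless of how many lines each class contains.
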
Problem

Research questions and friction points this paper is trying to address.

Detects Hardware Trojans at RTL level using LLM embeddings
Localizes Trojan lines precisely beyond coarse module detection
Addresses semantic loss and limited receptive field in GNN methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses RTL-finetuned LLM for module and line embeddings
Trains classifiers on embeddings for detection and localization
Introduces synthetic dataset with systematic Trojan injections
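
The embed-then-classify structure listed above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: a hashed bag-of-tokens stands in for the RTL-finetuned LLM embeddings, a nearest-centroid rule stands in for the task-specific classifiers, and the two Verilog snippets and their labels are invented toy data. It shows the shape of the pipeline only, not the paper's implementation.

```python
import re
import zlib

DIM = 32  # toy embedding size; real LLM hidden states are far larger


def embed_line(line):
    """Stand-in for a line-level LLM embedding: hashed bag of tokens."""
    vec = [0.0] * DIM
    for tok in re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", line):
        vec[zlib.crc32(tok.encode()) % DIM] += 1.0
    return vec


def embed_module(lines):
    """Stand-in for a module-level embedding: mean of the line vectors."""
    vecs = [embed_line(line) for line in lines]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


class NearestCentroid:
    """Minimal classifier: assign each vector to the closest class mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda k: dist(x, self.centroids[k]))
                for x in X]


# Hypothetical toy designs; the second contains an invented trigger condition.
clean = ["module and2(input a, input b, output y);",
         "assign y = a & b;",
         "endmodule"]
infected = ["module and2(input a, input b, input [31:0] cnt, output y);",
            "assign y = (cnt == 32'hDEADBEEF) ? 1'b0 : (a & b);",
            "endmodule"]

# Module-level detection: one embedding per module.
module_clf = NearestCentroid().fit(
    [embed_module(clean), embed_module(infected)], ["clean", "trojan"])

# Line-level localization: one embedding per line, 1 = Trojan-relevant.
line_clf = NearestCentroid().fit(
    [embed_line(line) for line in clean + infected], [0, 0, 0, 0, 1, 0])
line_pred = line_clf.predict([embed_line(line) for line in infected])
```

Keeping the embedding step separate from the classifiers mirrors the description above: the same embeddings can feed different classifiers for detection, type prediction, and localization, and swapping the toy `embed_line` for a real model would leave the rest of the sketch unchanged.
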