TrojanLoC: LLM-based Framework for RTL Trojan Localization

📅 2025-11-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address semantic loss, limited graph neural network (GNN) receptive fields, and coarse-grained localization in RTL-level hardware Trojan (HT) detection, this paper proposes TrojanLoC, a fine-grained HT localization framework based on an RTL-finetuned large language model (LLM). The method extracts module- and line-level embeddings directly from RTL source code, rather than from a derived dataflow graph, so that both global design context and local semantics are preserved, and then trains task-specific classifiers on those embeddings. The paper also introduces TrojanInS, a large synthetic dataset with line-level annotations and Trojans drawn from four effect-based categories, which supports multi-class HT detection. Experiments report a 0.99 F1-score for module-level detection (up to 0.68 higher than the baseline), 0.84 macro-F1 for HT type classification, and up to 0.93 macro-F1 for line-level localization. The work adapts RTL-finetuned LLMs to hardware security analysis, targeting semantic-aware, multi-granularity HT detection.

📝 Abstract

Hardware Trojans (HTs) are a persistent threat to integrated circuits, especially when inserted at the register-transfer level (RTL). Existing methods typically first convert the design into a graph, such as a gate-level netlist or an RTL-derived dataflow graph (DFG), and then use a graph neural network (GNN) to obtain an embedding of that graph. This approach (i) loses compact RTL semantics, (ii) relies on shallow GNNs with a limited receptive field, and (iii) is largely restricted to coarse, module-level binary HT detection. We propose TrojanLoC, an LLM-based framework for RTL-level HT localization. We use an RTL-finetuned LLM to derive module-level and line-level embeddings directly from RTL code, capturing both global design context and local semantics. Next, we train task-specific classifiers on these embeddings to perform module-level Trojan detection, type prediction, and fine-grained line-level localization. We also introduce TrojanInS, a large synthetic dataset of RTL designs with systematically injected Trojans from four effect-based categories, each accompanied by precise line-level annotations. Our experiments show that TrojanLoC achieves strong module-level performance, reaching a 0.99 F1-score for Trojan detection, up to 0.68 higher than the baseline, and 0.84 macro-F1 for Trojan-type classification. At the line level, TrojanLoC further achieves up to 0.93 macro-F1, enabling fine-grained localization of Trojan-relevant RTL lines.
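The type-classification and line-level results are reported as macro-F1, which is the unweighted mean of the per-class F1 scores, so a rare class (such as Trojan-relevant lines) counts as much as a common one. The sketch below shows that computation in plain Python. The labels and predictions are made-up toy values, not data from the paper.

```python
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical line labels: 0 = benign line, 1 = Trojan-relevant line.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = macro_f1(y_true, y_pred)
print(score)  # class 0 F1 = 0.75, class 1 F1 = 0.5, macro-F1 = 0.625
```

Plain accuracy on the same toy data would be 4/6, which hides how poorly the minority class is recovered. That is the usual reason to prefer macro-F1 for imbalanced line-level labels.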

Problem

Research questions and friction points this paper is trying to address.

Detects hardware Trojans at the RTL level using LLM embeddings

Localizes Trojan lines precisely beyond coarse module detection

Addresses semantic loss and limited receptive field in GNN methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses RTL-finetuned LLM for module and line embeddings

Trains classifiers on embeddings for detection and localization

Introduces synthetic dataset with systematic Trojan injections
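The embed-then-classify pipeline listed above can be sketched as follows. This is only an illustration under loud assumptions: `embed_line` is a hashed bag-of-tokens stand-in for the RTL-finetuned LLM, `embed_module` averages line vectors (the paper's pooling is not stated here), the classifier is a plain logistic regression chosen for brevity, and the RTL snippets and labels are invented.

```python
import math
import re
import zlib

DIM = 32  # hypothetical embedding width; real LLM embeddings are far larger

def embed_line(line):
    """Stand-in for an LLM line embedding: hashed bag-of-tokens, L2-normalized."""
    vec = [0.0] * DIM
    for tok in re.findall(r"[A-Za-z_]\w*|\d+'[bhd][0-9a-fA-F_]+|\S", line):
        vec[zlib.crc32(tok.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def embed_module(lines):
    """Module-level vector as the mean of line vectors (an assumption)."""
    rows = [embed_line(src) for src in lines]
    return [sum(col) / len(rows) for col in zip(*rows)]

def train_logreg(X, y, epochs=200, lr=0.5):
    """Binary logistic regression trained with per-sample gradient steps."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            g = 1.0 / (1.0 + math.exp(-z)) - t  # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Return 1 (flag as Trojan-relevant) when the linear score is positive."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)

# Invented toy module: (RTL line, label) with 1 = Trojan-relevant line.
rtl = [
    ("assign out = a & b;", 0),
    ("always @(posedge clk) q <= d;", 0),
    ("assign sum = x + y;", 0),
    ("if (counter == 32'hDEADBEEF) trigger <= 1'b1;", 1),
    ("assign key_out = trigger ? secret_key : 8'h00;", 1),
]
X = [embed_line(src) for src, _ in rtl]
y = [label for _, label in rtl]
w, b = train_logreg(X, y)
flags = [predict(w, b, x) for x in X]  # line-level localization output
module_vec = embed_module([src for src, _ in rtl])  # input to a module-level classifier
```

The same `train_logreg` call could be reused on `module_vec`-style inputs for module-level detection. A multi-class type predictor would need a softmax or one-vs-rest variant, which is omitted here.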
