Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

📅 2026-01-20

📈 Citations: 0

✨ Influential: 0

career value

181K/year

🤖 AI Summary

This work addresses a critical security gap in existing backdoor attacks against large language models, which predominantly rely on user-visible prompt-based triggers and overlook the risks posed by structural signals in multi-turn dialogues. The study proposes, for the first time, using dialogue turn indices as implicit, structured triggers to activate backdoors without requiring any user input, thereby circumventing the limitations of conventional prompt-dependent attacks. Through fine-tuning-based backdoor injection and evaluation in multi-turn interaction scenarios, the method achieves an average attack success rate of 99.52% across four mainstream open-source large language models. It maintains robust performance under five representative defenses, with a success rate of 98.04%, and demonstrates strong generalization, revealing dialogue structure as a novel and potent attack surface.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) are widely integrated into interactive systems such as dialogue agents and task-oriented assistants. This growing ecosystem also raises supply-chain risks, where adversaries can distribute poisoned models that degrade downstream reliability and user trust. Existing backdoor attacks and defenses are largely prompt-centric, focusing on user-visible triggers while overlooking structural signals in multi-turn conversations. We propose Turn-based Structural Trigger (TST), a backdoor attack that activates from dialogue structure, using the turn index as the trigger and remaining independent of user inputs. Across four widely used open-source LLM models, TST achieves an average attack success rate (ASR) of 99.52% with minimal utility degradation, and remains effective under five representative defenses with an average ASR of 98.04%. The attack also generalizes well across instruction datasets, maintaining an average ASR of 99.19%. Our results suggest that dialogue structure constitutes an important and under-studied attack surface for multi-turn LLM systems, motivating structure-aware auditing and mitigation in practice.

Problem

Research questions and friction points this paper is trying to address.

backdoor attacks

multi-turn LLMs

dialogue structure

structural triggers

supply-chain risks

Innovation

Methods, ideas, or system contributions that make the work stand out.

backdoor attack

turn-based trigger

structural trigger