Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work uncovers a mechanism-level security vulnerability in model-based speculative decoding that arises from distributional mismatch between the draft and target models: an adversary can significantly degrade inference acceleration while preserving output semantics. To exploit this, we propose a stealthy acceleration-collapse attack that uses adversarial optimization to jointly minimize the draft-token acceptance rate and the semantic deviation of the target model's output distribution, with a null-space projection that resolves the conflict between the two objectives. Our method demonstrates, for the first time, that the speculative decoding mechanism itself constitutes an exploitable attack surface. Evaluated across multiple speculative decoding systems, it substantially reduces the average accepted length τ, and with it throughput, while maintaining consistent output quality and perplexity.
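To make the quantity τ concrete, here is a minimal sketch of how an average accepted length could be measured. It assumes greedy verification, where a draft token is accepted only if it equals the token the target model would have produced at that position; sampling-based verifiers use a probabilistic acceptance test instead. The function names and the toy token IDs are hypothetical, and whether τ also counts the extra token emitted by the target model at each step depends on the paper's convention.

```python
from typing import List, Sequence, Tuple


def accepted_length(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count the leading draft tokens that match the target's tokens.

    Assumption: greedy verification, so acceptance stops at the first mismatch.
    """
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def average_accepted_length(steps: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average accepted length over a list of (draft, target) verification steps."""
    if not steps:
        return 0.0
    lengths = [accepted_length(d, t) for d, t in steps]
    return sum(lengths) / len(lengths)


# Toy token IDs (made up): the target's tokens are the same in both cases,
# only the drafter's proposals differ.
benign = [([5, 9, 2, 7], [5, 9, 2, 7]), ([3, 3, 8, 4], [3, 3, 8, 1])]
attacked = [([5, 1, 2, 7], [5, 9, 2, 7]), ([6, 3, 8, 1], [3, 3, 8, 1])]

print(average_accepted_length(benign))    # 3.5
print(average_accepted_length(attacked))  # 0.5
```

In this toy setting the target output is unchanged, yet far fewer draft tokens survive each verification step, which is the kind of degradation the summary describes.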

📝 Abstract

Speculative decoding has become a widely adopted technique for accelerating large language model (LLM) inference by drafting multiple candidate tokens and verifying them with a target model in parallel. Its efficiency, however, critically depends on the average accepted length $\tau$, i.e., how many draft tokens survive each verification step. In this work, we identify a new mechanism-level vulnerability in model-based speculative decoding: the drafter is trained to approximate the target model's distribution, but this approximation is inevitably imperfect. This drafter-target mismatch creates a hidden attack surface where small perturbations can preserve the target model's visible behavior while substantially reducing draft-token acceptability. We propose Mistletoe, a stealthy acceleration-collapse attack that directly targets the acceptance mechanism of speculative decoding. It jointly optimizes a degradation objective that decreases drafter-target agreement and a semantic-preservation objective that constrains the target model's output distribution. To resolve the conflict between these objectives, we introduce a null-space projection mechanism, in which degradation gradients are projected away from the local semantic-preserving direction, suppressing draft acceptance while minimizing semantic drift. Experiments on various speculative decoding systems show that Mistletoe substantially reduces the average accepted length $\tau$, collapses speedup, and lowers average token throughput, while preserving output quality and perplexity. Our work highlights that speculative decoding introduces a mechanism-level attack surface that goes beyond existing notions of output robustness, calling for more robust designs of LLM acceleration systems.
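The projection step described above can be illustrated with a small sketch. It assumes the simplest possible case, in which the semantic-preserving direction is a single gradient vector, so the projection just removes the component of the degradation gradient along that vector. The paper's actual mechanism may project onto a higher-dimensional null space; the function names and the example vectors here are hypothetical.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(g_deg: Sequence[float], g_sem: Sequence[float],
                eps: float = 1e-12) -> List[float]:
    """Remove from g_deg its component along g_sem.

    Assumption: a rank-one projection, g_deg - (<g_deg, g_sem> / ||g_sem||^2) * g_sem.
    The result is orthogonal to g_sem, so to first order a step along it
    does not change the semantic-preservation objective.
    """
    denom = dot(g_sem, g_sem)
    if denom < eps:
        # No usable semantic direction: leave the gradient unchanged.
        return list(g_deg)
    coef = dot(g_deg, g_sem) / denom
    return [d - coef * s for d, s in zip(g_deg, g_sem)]


g_deg = [1.0, 1.0]   # hypothetical gradient of the degradation objective
g_sem = [0.0, 2.0]   # hypothetical gradient of the semantic-preservation objective

g_proj = project_out(g_deg, g_sem)
print(g_proj)              # [1.0, 0.0]
print(dot(g_proj, g_sem))  # 0.0
```

The design choice this illustrates is that the two objectives are not simply summed: the degradation update is restricted to directions that leave the semantic objective locally unchanged.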

Problem

Research questions and friction points this paper is trying to address.

speculative decoding

acceleration-collapse attack

drafter-target mismatch

token acceptance

LLM inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

speculative decoding

acceleration-collapse attack

drafter-target mismatch