Style Attack Disguise: When Fonts Become a Camouflage for Adversarial Intent

📅 2025-10-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work uncovers a fundamental discrepancy between human readability and model perception of stylized fonts, a previously unrecognized adversarial attack surface in NLP. To exploit it, we propose Style Attack Disguise (SAD), the first font-style-based adversarial attack framework. SAD operates via character-level stylistic substitution and token-mapping discrepancy analysis, enabling effective, stealthy, and query-efficient attacks against traditional models, large language models (LLMs), and commercial APIs, while preserving semantic integrity and human readability. It supports two operational modes: light (query-efficient) and strong (maximum attack performance). Empirical evaluation across sentiment analysis, machine translation, and multimodal generation demonstrates high attack success rates and substantial degradation in system performance. Crucially, SAD is the first systematic investigation to expose severe font-robustness vulnerabilities in state-of-the-art NLP and multimodal models.
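The character-level substitution step is easy to illustrate. Below is a minimal sketch (our illustration, not the paper's code) that maps ASCII letters onto their Unicode "Mathematical Bold" look-alikes; the helper name `to_math_bold` and the choice of the bold style are assumptions made for demonstration.

```python
# Minimal sketch of character-level stylistic substitution, the core idea
# behind a font-style attack (not the authors' implementation): map ASCII
# letters to Unicode Mathematical Bold letters, which humans read as
# ordinary text but which models see as entirely different code points.

def to_math_bold(text: str) -> str:
    """Replace A-Z/a-z with their Mathematical Bold look-alikes."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))  # U+1D400 = bold 'A'
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))  # U+1D41A = bold 'a'
        else:
            out.append(ch)  # digits, punctuation, spaces pass through
    return "".join(out)

plain = "great movie"
styled = to_math_bold(plain)
print(styled)  # 𝐠𝐫𝐞𝐚𝐭 𝐦𝐨𝐯𝐢𝐞 -- still readable to humans
# Each styled letter is a distinct non-ASCII code point, so a subword
# tokenizer no longer sees the familiar tokens for "great" or "movie".
print([hex(ord(c)) for c in styled if c != " "])
```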

📝 Abstract
With the growth of social media, users employ stylistic fonts and font-like emoji to express individuality, creating visually appealing text that remains human-readable. However, these fonts introduce hidden vulnerabilities in NLP models: while humans easily read stylistic text, models process its characters as distinct tokens, causing interference. We identify this human-model perception gap and propose a style-based attack, Style Attack Disguise (SAD). We design two variants: a light one for query efficiency and a strong one for superior attack performance. Experiments on sentiment classification and machine translation across traditional models, LLMs, and commercial services demonstrate SAD's strong attack performance. We also show SAD's potential threats to multimodal tasks, including text-to-image and text-to-speech generation.
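The token-mapping discrepancy the abstract describes can be observed by comparing token counts on plain versus styled text. The sketch below uses the open-source tiktoken tokenizer (cl100k_base) as a stand-in, which is our assumption; the paper's targets may tokenize differently.

```python
# Illustrative check of the human-model perception gap: styled text that
# reads identically to a human explodes into many more tokens for a
# subword tokenizer. tiktoken here is a stand-in, not the paper's setup.
import tiktoken

def to_math_bold(text: str) -> str:
    # Same look-alike mapping as the sketch above (lowercase only).
    return "".join(
        chr(0x1D41A + ord(c) - ord("a")) if "a" <= c <= "z" else c
        for c in text
    )

enc = tiktoken.get_encoding("cl100k_base")
plain = "great movie"
styled = to_math_bold(plain)

print(len(enc.encode(plain)))   # e.g. 2 tokens for ordinary ASCII text
print(len(enc.encode(styled)))  # far more: each styled letter falls back
                                # to multi-byte UTF-8 fragments the model
                                # never saw as the words "great" or "movie"
```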
Problem

Research questions and friction points this paper is trying to address.

Exploits font-model perception gap for adversarial attacks
Targets NLP systems via stylized text vulnerabilities
Threatens multimodal applications like text-to-image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Style-based attack exploits font-model perception gap
Light and strong attack designs balance efficiency and performance (see the sketch after this list)
Multimodal threat demonstrated across text-image-speech tasks
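The summary does not spell out how the light and strong variants differ internally, so the following is only a plausible reading, not the paper's algorithm: light perturbs a sampled fraction of letters to keep queries cheap and edits subtle, while strong perturbs every letter for maximal token disruption. The `rate` parameter and the random selection heuristic are our assumptions.

```python
# Purely illustrative contrast between a "light" and a "strong" mode of a
# font-style attack; the actual selection strategy in SAD is not detailed
# here, so the partial-substitution heuristic below is an assumption.
import random

def style_char(ch: str) -> str:
    if "A" <= ch <= "Z":
        return chr(0x1D400 + ord(ch) - ord("A"))  # Mathematical Bold capitals
    if "a" <= ch <= "z":
        return chr(0x1D41A + ord(ch) - ord("a"))  # Mathematical Bold lowercase
    return ch

def sad_like_attack(text: str, mode: str = "light", rate: float = 0.3) -> str:
    """light: perturb a random fraction of letters (cheaper, stealthier);
    strong: perturb every letter (maximal token disruption)."""
    return "".join(
        style_char(ch) if mode == "strong" or random.random() < rate else ch
        for ch in text
    )

print(sad_like_attack("this film was wonderful", mode="light"))
print(sad_like_attack("this film was wonderful", mode="strong"))
```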
Authors
Yangshijie Zhang, Lanzhou University
Xinda Wang, University of Texas at Dallas
Jialin Liu, Peking University
Wenqiang Wang, Sun Yat-sen University
Zhicong Ma, Lanzhou University
Xingxing Jia, Lanzhou University