SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation

📅 2025-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Computing word-level sensitivity for sequence classification under the definition of Hahn et al. (2021) takes exponential time, which makes it impractical at scale. This paper proposes SMAB, a framework that formulates sensitivity estimation as a multi-armed bandit (MAB) problem and estimates both local (sentence-level) and global (aggregated) word sensitivities with respect to an underlying text classifier. A case study on a CHECKLIST-generated sentiment analysis dataset shows that SMAB identifies intuitively high- and low-sensitivity words, and experiments across multiple tasks and languages show that sensitivity can serve as a proxy for accuracy when gold labels are unavailable. Using sensitivity values to guide perturbation prompts improves adversarial attack success rate by 15.58%, and using sensitivity as an additional reward in adversarial paraphrase generation yields a 12.00% improvement over state-of-the-art approaches.

📝 Abstract

To understand the complexity of sequence classification tasks, Hahn et al. (2021) defined sensitivity as the number of disjoint subsets of the input sequence that can each be individually changed to change the output. Though effective, calculating sensitivity at scale using this framework is costly because of its exponential time complexity. Therefore, we introduce a Sensitivity-based Multi-Armed Bandit framework (SMAB), which provides a scalable approach for calculating word-level local (sentence-level) and global (aggregated) sensitivities with respect to an underlying text classifier for any dataset. We establish the effectiveness of our approach through various applications. We perform a case study on a CHECKLIST-generated sentiment analysis dataset, where we show that our algorithm indeed captures intuitively high- and low-sensitivity words. Through experiments on multiple tasks and languages, we show that sensitivity can serve as a proxy for accuracy in the absence of gold data. Lastly, we show that guiding perturbation prompts using sensitivity values in adversarial example generation improves attack success rate by 15.58%, whereas using sensitivity as an additional reward in adversarial paraphrase generation gives a 12.00% improvement over SOTA approaches. Warning: Contains potentially offensive content.
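
To make the bandit framing concrete, here is a minimal Python sketch. It is not the authors' SMAB algorithm, whose details the abstract does not give. It assumes that each distinct word is an arm, that the reward is 1 when replacing one occurrence of the word flips the classifier's label, and that arms are chosen with the standard UCB1 rule. The classifier `toy_classifier`, the helper names, the vocabulary, and the example sentences are all hypothetical stand-ins.

```python
import math
import random

# Hypothetical keyword lexicons for a toy sentiment classifier (assumption).
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def toy_classifier(tokens):
    """Stand-in classifier: label 1 if positive words outnumber negative ones."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score > 0 else 0


def flip_reward(tokens, index, replacement):
    """Return 1.0 if swapping tokens[index] for `replacement` changes the label."""
    perturbed = tokens[:index] + [replacement] + tokens[index + 1:]
    return 1.0 if toy_classifier(perturbed) != toy_classifier(tokens) else 0.0


def estimate_sensitivity(sentences, vocab, rounds=2000, seed=0):
    """Estimate a global per-word flip rate with UCB1 (one arm per distinct word)."""
    rng = random.Random(seed)

    # Record every (sentence, position) at which each word occurs.
    occurrences = {}
    for s_idx, tokens in enumerate(sentences):
        for t_idx, tok in enumerate(tokens):
            occurrences.setdefault(tok, []).append((s_idx, t_idx))

    arms = sorted(occurrences)
    pulls = {a: 0 for a in arms}
    total = {a: 0.0 for a in arms}

    for step in range(1, rounds + 1):
        untried = [a for a in arms if pulls[a] == 0]
        if untried:
            arm = untried[0]  # pull every arm once before applying UCB1
        else:
            # UCB1: empirical mean plus an exploration bonus.
            arm = max(
                arms,
                key=lambda a: total[a] / pulls[a]
                + math.sqrt(2.0 * math.log(step) / pulls[a]),
            )
        s_idx, t_idx = rng.choice(occurrences[arm])
        replacement = rng.choice([w for w in vocab if w != arm])
        total[arm] += flip_reward(sentences[s_idx], t_idx, replacement)
        pulls[arm] += 1

    # Arms that were never pulled (rounds < number of arms) are reported as 0.0.
    return {a: (total[a] / pulls[a] if pulls[a] else 0.0) for a in arms}


sentences = [
    ["i", "love", "this", "movie"],
    ["the", "plot", "was", "awful"],
]
vocab = ["good", "bad", "the", "movie", "fine"]
estimates = estimate_sensitivity(sentences, vocab)
for word, value in sorted(estimates.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {value:.2f}")
```

In this sketch the cost is fixed by the number of rounds and does not grow with the number of input subsets. The returned values correspond only loosely to the global (aggregated) sensitivities described in the abstract. A local, sentence-level variant would keep separate statistics for each sentence.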

Problem

Research questions and friction points this paper is trying to address.

Scalable word sensitivity estimation

Improving adversarial text generation

Sensitivity as accuracy proxy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Armed Bandit framework

Word-level sensitivity estimation

Adversarial text generation enhancement
