🤖 AI Summary
This work uncovers a novel acoustic-layer security threat to large language model (LLM)-enhanced automatic speech recognition (ASR) systems: a stealthy backdoor attack grounded in financial time-series modeling. To demonstrate it, we propose MarketBackFinal 2.0, the first backdoor framework to integrate stock market dynamics modeling and Bayesian optimization into acoustic data poisoning. Its key contributions are: (1) generating semantically neutral, physically realizable audio triggers from real-world stock price fluctuations; (2) achieving task-agnostic, cross-model generalization and robustness against common audio perturbations; and (3) attaining an attack success rate above 92% on ASR tasks with only 0.3% of the training samples poisoned. The work establishes the first interdisciplinary paradigm bridging financial modeling and speech security, introducing a new benchmark for evaluating the robustness of LLM-augmented speech systems.
📝 Abstract
Since the advent of generative artificial intelligence, companies and researchers alike have been rushing to develop their own generative models, commercial or otherwise. Despite the large number of users of these powerful new tools, there is currently no intrinsically verifiable way to explain, from the ground up, what happens when large language models (LLMs) learn. This is the case, for example, for models built on automatic speech recognition systems, which must rely on enormous amounts of data collected from all over the web to produce fast and efficient results. In this article, we develop a backdoor attack called MarketBackFinal 2.0, which is based on acoustic data poisoning and draws mainly on modern stock market models. Our aim is to expose possible vulnerabilities of speech-based transformers that may rely on LLMs.