LLM Security and Safety: Insights from Homotopy-Inspired Prompt Obfuscation

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of large language models (LLMs) to carefully crafted prompt attacks, which expose critical gaps in existing safety mechanisms. For the first time, homotopy theory is introduced into prompt engineering to develop a systematic prompt obfuscation framework. By designing and injecting specialized prompts, the approach reveals anomalous behaviors and latent vulnerabilities in LLMs during tasks such as code generation. Extensive experiments across prominent models (LLaMA, Deepseek, KIMI, and Claude), comprising 15,732 prompts with 10,000 high-priority cases, demonstrate the efficacy of this method. The study establishes a novel paradigm and provides theoretical grounding for building more reliable, detectable, and resilient AI systems.

📝 Abstract
In this study, we propose a homotopy-inspired prompt obfuscation framework to enhance understanding of security and safety vulnerabilities in Large Language Models (LLMs). By systematically applying carefully engineered prompts, we demonstrate how latent model behaviors can be influenced in unexpected ways. Our experiments encompassed 15,732 prompts, including 10,000 high-priority cases, across LLaMA, Deepseek, and KIMI for code generation, with Claude used for verification. The results reveal critical insights into current LLM safeguards, highlighting the need for more robust defense mechanisms, reliable detection strategies, and improved resilience. Importantly, this work provides a principled framework for analyzing and mitigating potential weaknesses, with the goal of advancing safe, responsible, and trustworthy AI technologies.
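The paper does not publish its obfuscation algorithm, but the homotopy framing suggests a parametric family of prompts H(p, t) that deforms a benign prompt (t = 0) into an obfuscated variant (t = 1). A minimal sketch of that idea, under the assumption of a simple word-level blend (the function `homotopy_prompt` and its blending rule are hypothetical illustrations, not the authors' method):

```python
# Hypothetical sketch: a discrete analogue of a homotopy H(p, t) between
# a benign prompt (t = 0) and an obfuscated variant (t = 1).
# This is an assumption for illustration; the paper's actual deformation
# scheme is not specified in the abstract.

def homotopy_prompt(benign: str, obfuscated: str, t: float) -> str:
    """Blend two prompts word-by-word; t in [0, 1] controls how many
    leading word positions are drawn from the obfuscated endpoint."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    b_words = benign.split()
    o_words = obfuscated.split()
    n = max(len(b_words), len(o_words))
    cut = round(t * n)  # positions [0, cut) use the obfuscated text
    blended = []
    for i in range(n):
        src = o_words if i < cut else b_words
        if i < len(src):
            blended.append(src[i])
    return " ".join(blended)

# Endpoints behave like H(p, 0) = benign and H(p, 1) = obfuscated:
print(homotopy_prompt("write a sorting function", "wr1te a s0rting functi0n", 0.0))
print(homotopy_prompt("write a sorting function", "wr1te a s0rting functi0n", 1.0))
```

Intermediate values of t would then probe where along the path a model's safety filter stops recognizing the prompt, which matches the paper's goal of surfacing latent vulnerabilities.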
Problem

Research questions and friction points this paper is trying to address.

LLM security
prompt obfuscation
safety vulnerabilities
adversarial prompts
AI trustworthiness
Innovation

Methods, ideas, or system contributions that make the work stand out.

homotopy-inspired
prompt obfuscation
LLM security
safety vulnerabilities
adversarial probing
Luis Lazo
Canadian Institute for Cybersecurity, Faculty of Computer Science, University of New Brunswick
Hamed Jelodar
CIC | UNB, DAL, NJUST
AI | Machine Learning | Natural Language Processing | Digital Mental Health | CyberNLP | Topic Modeling
R. Razavi-Far
Canadian Institute for Cybersecurity, Faculty of Computer Science, University of New Brunswick