🤖 AI Summary
This work identifies a novel, large-scale identity impersonation threat enabled by efficient personalized text generation: malicious actors can cheaply clone an individual's writing style by applying parameter-efficient fine-tuning (PEFT) methods such as LoRA or QLoRA to open-source LLMs, using only a small set of the target's publicly available texts, thereby enabling targeted phishing and other social engineering attacks. Unlike image or audio deepfakes, this textual impersonation is highly stealthy, difficult to detect, and currently lacks dedicated defenses. The study provides the first systematic demonstration of its technical feasibility and security implications, revealing that mainstream LLMs and existing AI safety frameworks have no mechanisms to mitigate text-level identity forgery. We advocate for specialized detection techniques, data governance policies, and model release standards to close this "textual deepfake" gap, urging the research and industrial communities to prioritize identity integrity in generative text applications.
📝 Abstract
The recent surge in high-quality open-source generative text models (colloquially: LLMs), together with efficient fine-tuning techniques, has made it possible to create high-quality personalized models: models that generate text attuned to a specific individual's needs and can credibly imitate their writing style by refining an open-source model on that person's own data. The technology to create such models is accessible to private individuals, and training and running them can be done cheaply on consumer-grade hardware. These advancements are a major gain for usability and privacy. This position paper argues, however, that they also introduce new safety risks by making it practically feasible for malicious actors to impersonate specific individuals at scale, for instance in phishing emails, based on small amounts of publicly available text. We further argue that these risks are complementary to, and distinct from, the much-discussed risks of other impersonation attacks such as image, voice, or video deepfakes, and that they are not adequately addressed by the larger research community or by the current generation of open- and closed-source models.
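The "cheap on consumer-grade hardware" claim can be made concrete with a back-of-the-envelope calculation: LoRA-style PEFT trains only small low-rank adapter matrices rather than the full model. The sketch below uses illustrative, hypothetical values for a 7B-class transformer (hidden size 4096, 32 layers) with rank-8 adapters on two attention projections per layer, a configuration typical of the kind that fits in consumer GPU memory.

```python
def lora_trainable_params(d_model: int, n_layers: int, rank: int,
                          targets_per_layer: int = 2) -> int:
    """Parameters added by LoRA: each adapted square weight matrix
    W (d_model x d_model) gains two low-rank factors A (rank x d_model)
    and B (d_model x rank), i.e. rank * 2 * d_model params per target."""
    return n_layers * targets_per_layer * rank * 2 * d_model

# Illustrative (hypothetical) 7B-class configuration
d_model, n_layers, total_params = 4096, 32, 7_000_000_000
lora = lora_trainable_params(d_model, n_layers, rank=8)
print(f"LoRA trainable params: {lora:,}")                   # 4,194,304
print(f"Fraction of full model: {lora / total_params:.4%}") # 0.0599%
```

Under these assumptions, the adapter amounts to roughly four million trainable parameters, well under 0.1% of the model, which is why a small corpus of someone's public writing suffices for the style-cloning attack the paper describes.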