PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks

2026-04-25

AI Summary
This work addresses the challenges of cold-start latency in user personalization and the lack of psychological grounding in static strategies within AR-LLM-based social engineering attacks. To overcome these limitations, the authors propose a real-time adaptive attack framework that leverages a vision-language model (VLM)-driven social context pretraining mechanism to rapidly construct user profiles, thereby eliminating cold-start delays. The framework further integrates retrieval-augmented generation (RAG) with a dynamic strategy module grounded in established psychological theories, enabling feedback-driven conversational steering. Evaluated in an IRB-approved study involving 60 participants, the system demonstrates significant improvements in both interaction efficiency and attack success rates. As an additional contribution, the authors introduce a novel dataset comprising 360 annotated multi-scenario dialogues.


Abstract
The emerging threat of AR-LLM-based Social Engineering (AR-LLM-SE) attacks (e.g., SEAR) poses a significant risk to real-world social interactions. In such an attack, a malicious actor uses Augmented Reality (AR) glasses to capture a target's visual and vocal data. A Large Language Model (LLM) then analyzes this data to identify the individual and generate a detailed social profile. Subsequently, LLM-powered agents employ social engineering strategies, providing real-time conversation suggestions to gain the target's trust and ultimately execute phishing or other malicious acts. Despite its potential, the practical application of AR-LLM-SE faces two major bottlenecks: (1) Cold-start personalization: current Retrieval-Augmented Generation (RAG) methods introduce critical delays in the earliest turns, slowing initial profile formation and disrupting real-time interaction; and (2) Static attack strategies: existing approaches rely on fixed-stage, handcrafted social engineering tactics that lack a foundation in established psychological theory. To address these limitations, we propose PhySE, a novel framework with two core innovations: (1) VLM-Based SocialContext Training: to eliminate profiling delays, we efficiently pre-train a Visual Language Model (VLM) with social-context data, enabling rapid, on-the-fly profile generation; and (2) Adaptive Psychological Agent: we introduce a psychological LLM that dynamically deploys distinct classes of psychological strategies based on the target's responses, moving beyond static, handcrafted scripts. We evaluated PhySE through an IRB-approved user study with 60 participants, collecting a novel dataset of 360 annotated conversations across diverse social scenarios.
Problem

Research questions and friction points this paper is trying to address.

AR-LLM-SE
Cold-start personalization
Static Attack Strategies
Social Engineering
Real-time Interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual Language Model
Psychological Framework
Real-Time Social Engineering
Adaptive Strategy
AR-LLM