ROBAD: Robust Adversary-aware Local-Global Attended Bad Actor Detection Sequential Model

📅 2025-07-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing malicious user detection models lack robustness against adversarial attacks: they are highly sensitive to even minor perturbations in input sequences. To address this, the authors propose a detection framework that combines local–global modeling with adversarially robust learning. A Transformer encoder extracts post-level local features, while a decoder captures global temporal patterns across a user's posting sequence. The framework pairs this local–global attention mechanism with adversarial behavior simulation, further enhanced by contrastive learning to strengthen discriminative capability at the classification layer. Extensive experiments on the Yelp and Wikipedia datasets show that the method remains stable under diverse state-of-the-art adversarial attacks and achieves substantially higher accuracy than existing SOTA baselines, improving the robustness and reliability of malicious user identification on online platforms.

📝 Abstract
Detecting bad actors is critical to ensure the safety and integrity of internet platforms. Several deep learning-based models have been developed to identify such users. These models should not only accurately detect bad actors, but also be robust against adversarial attacks that aim to evade detection. However, past deep learning-based detection models do not meet the robustness requirement because they are sensitive to even minor changes in the input sequence. To address this issue, we focus on (1) improving the model's understanding capability and (2) enhancing the model's knowledge such that it can recognize potential input modifications when making predictions. To achieve these goals, we create a novel transformer-based classification model, called ROBAD (RObust adversary-aware local-global attended Bad Actor Detection model), which uses the sequence of user posts to generate user embeddings to detect bad actors. Particularly, ROBAD first leverages the transformer encoder block to encode each post bidirectionally, thus building a post embedding that captures the local information at the post level. Next, it adopts the transformer decoder block to model the sequential pattern in the post embeddings by using the attention mechanism, which generates the sequence embedding to obtain the global information at the sequence level. Finally, to enrich the knowledge of the model, embeddings of sequences modified by mimicked attackers are fed into a contrastive-learning-enhanced classification layer for sequence prediction. In essence, by capturing the local and global information (i.e., the post and sequence information) and leveraging the mimicked behaviors of bad actors in training, ROBAD can be robust to adversarial attacks. Extensive experiments on Yelp and Wikipedia datasets show that ROBAD can effectively detect bad actors under state-of-the-art adversarial attacks.
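The encode-then-decode-then-classify pipeline described in the abstract can be pictured with a minimal PyTorch sketch. All module sizes and pooling choices here are illustrative assumptions, and the paper's decoder block is approximated with a causally masked encoder layer; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class ROBADSketch(nn.Module):
    """Toy local-global pipeline: per-post encoder -> cross-post attention -> classifier."""
    def __init__(self, vocab_size=1000, d_model=64, nhead=4, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        post_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.post_encoder = nn.TransformerEncoder(post_layer, num_layers=2)  # local: within a post
        seq_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.seq_decoder = nn.TransformerEncoder(seq_layer, num_layers=2)    # global: across posts
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, posts):
        # posts: (batch, n_posts, n_tokens) of token ids
        b, p, t = posts.shape
        tok = self.embed(posts.reshape(b * p, t))            # (b*p, t, d_model)
        post_emb = self.post_encoder(tok).mean(dim=1)        # pool tokens -> post embedding
        post_emb = post_emb.reshape(b, p, -1)                # (b, p, d_model)
        causal = nn.Transformer.generate_square_subsequent_mask(p)
        seq_emb = self.seq_decoder(post_emb, mask=causal).mean(dim=1)  # sequence embedding
        return self.classifier(seq_emb)                      # bad-actor logits

model = ROBADSketch()
logits = model(torch.randint(0, 1000, (2, 5, 8)))  # 2 users, 5 posts, 8 tokens each
```

The key structural point the sketch preserves is the two-level factorization: tokens are attended bidirectionally only within a post, and only the pooled post embeddings attend to each other across the sequence.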
Problem

Research questions and friction points this paper is trying to address.

Detect bad actors robustly against adversarial attacks
Improve model understanding and knowledge for input modifications
Enhance detection using local-global post sequence embeddings
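The "input modifications" in these bullets refer to an evading bad actor lightly editing their posting sequence. A toy perturbation routine makes the threat concrete; the edit operations (dropping and swapping posts) are hypothetical choices for illustration, and the paper's mimicked attackers may use different ones.

```python
import random

def mimic_attack(posts, p_drop=0.2, p_swap=0.2, rng=None):
    """Simulate an evading attacker: randomly drop posts and swap adjacent
    posts in a user's posting sequence (hypothetical edit operations)."""
    rng = rng or random.Random(0)
    out = [p for p in posts if rng.random() > p_drop]  # drop some posts
    for i in range(len(out) - 1):
        if rng.random() < p_swap:                      # swap adjacent posts
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

perturbed = mimic_attack(list(range(10)))
```

A detector that is robust in the paper's sense should map `posts` and `mimic_attack(posts)` to nearby embeddings with the same predicted label.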
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based local-global attention model
Contrastive-learning-enhanced classification layer
Adversarial-aware sequence embedding training
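The contrastive-learning-enhanced classification layer can be illustrated with an NT-Xent-style loss that pulls each clean sequence embedding toward the embedding of the same user's attacker-modified sequence, treating all other users' embeddings as negatives. This NumPy sketch assumes one positive pair per user; the paper's exact loss formulation may differ.

```python
import numpy as np

def contrastive_loss(clean_emb, perturbed_emb, temperature=0.1):
    """NT-Xent-style loss: row i of clean_emb has row i of perturbed_emb
    as its positive; every other row in the batch is a negative."""
    z = np.concatenate([clean_emb, perturbed_emb], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)       # cosine-similarity space
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                         # exclude self-similarity
    n = clean_emb.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # each row's positive index
    log_prob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 8))
loss_aligned = contrastive_loss(clean, clean + 0.01 * rng.normal(size=(4, 8)))
loss_random = contrastive_loss(clean, rng.normal(size=(4, 8)))
```

Minimizing such a loss encourages embeddings of a sequence and its adversarially modified variant to coincide, which is what makes the downstream classification stable under the mimicked attacks.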