🤖 AI Summary
To address the challenge of effectively modeling high-cardinality, multidimensional, and long-range user behavioral sequences—where conventional large language models fall short—this paper proposes a generative pre-training framework specifically designed for behavioral sequences. Our method introduces three key innovations: (1) a structured tokenization technique that maps discrete behavioral events into semantically coherent subword units; (2) a periodic pattern recognition module that explicitly captures temporal regularities in user behavior; and (3) a unified user representation embedding mechanism integrating static attributes and dynamic behaviors, accelerated by offline caching for millisecond-level online inference. Deployed at scale in WeChat Pay, our approach improves next-transaction prediction HitRate@1 by 25.6% and fraud detection recall by 38.6%. On cross-domain public benchmarks, it outperforms Transformer baselines by up to 21% in HitRate@1, demonstrating strong representational capacity, high interpretability, and industrial-grade deployment efficiency.
📝 Abstract
Large language models (LLMs) have shown that generative pretraining can distill vast world knowledge into compact token representations. While LLMs encapsulate extensive world knowledge, they remain limited in modeling the behavioral knowledge contained within user interaction histories. User behavior forms a distinct modality, where each action, defined by multi-dimensional attributes such as time, context, and transaction type, constitutes a behavioral token. Modeling these high-cardinality sequences is challenging, and discriminative models often falter under limited supervision. To bridge this gap, we extend generative pretraining to user behavior, learning transferable representations from unlabeled behavioral data analogous to how LLMs learn from text. We present PANTHER, a hybrid generative-discriminative framework that unifies user behavior pretraining and downstream adaptation, enabling large-scale sequential user representation learning and real-time inference. PANTHER introduces: (1) Structured Tokenization to compress multi-dimensional transaction attributes into an interpretable vocabulary; (2) Sequence Pattern Recognition Module (SPRM) for modeling periodic transaction motifs; (3) a Unified User-Profile Embedding that fuses static demographics with dynamic transaction histories; and (4) Real-time scalability enabled by offline caching of pretrained embeddings for millisecond-level inference. Fully deployed and operational online at WeChat Pay, PANTHER delivers a 25.6% boost in next-transaction prediction HitRate@1 and a 38.6% relative improvement in fraud detection recall over baselines. Cross-domain evaluations on public benchmarks show strong generalization, achieving up to 21% HitRate@1 gains over Transformer baselines, establishing PANTHER as a scalable, high-performance framework for industrial sequential user behavior modeling.
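To make the Structured Tokenization idea concrete, here is a minimal, hypothetical sketch of how multi-dimensional transaction attributes (time, context, amount) might be compressed into a small, interpretable vocabulary of sub-tokens. All names, bucket choices, and the log-scale amount binning are illustrative assumptions, not the paper's actual scheme:

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Transaction:
    hour: int          # 0-23, event time of day
    context: str       # e.g. "qr_pay", "transfer" (illustrative categories)
    amount: float      # transaction amount

def tokenize(txn: Transaction) -> list[str]:
    """Map one behavioral event to a few structured sub-tokens."""
    # Coarse time-of-day bucket (4 buckets/day) keeps periodic patterns visible.
    time_tok = f"T{txn.hour // 6}"
    # Context is low-cardinality, so it is kept as a literal sub-token.
    ctx_tok = f"C:{txn.context}"
    # Log-scale amount bucket bounds vocabulary size under high cardinality.
    amt_tok = f"A{min(int(math.log10(max(txn.amount, 1))), 6)}"
    return [time_tok, ctx_tok, amt_tok]

seq = [Transaction(9, "qr_pay", 25.0), Transaction(21, "transfer", 1200.0)]
tokens = [tok for t in seq for tok in tokenize(t)]
# tokens == ["T1", "C:qr_pay", "A1", "T3", "C:transfer", "A3"]
```

Each raw event thus becomes a short run of semantically coherent sub-tokens, so a generative model can be pretrained over the resulting sequence exactly as an LLM is pretrained over subword text.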