🤖 AI Summary
To address the challenge of modeling massive, high-dimensional, and multimodal consumer transaction trajectories in one of the world's largest payment networks, this paper introduces TransactionGPT (TGPT), a foundation model designed specifically for transaction data. Methodologically, it proposes a novel 3D-Transformer architecture that jointly models temporal dynamics, merchant/category semantic dimensions, and LLM-generated embedding dimensions, improving both multimodal fusion efficiency and representation capacity. The model supports transaction trajectory understanding as well as generation, and unifies diverse downstream tasks (including sales forecasting, fraud detection, and user segmentation) under a single framework. Trained on billion-scale real-world anonymized transactions, TGPT achieves an average accuracy improvement of 12.7% across multiple benchmarks, attains 3.2× faster inference speed than current production models, and demonstrates strong capability in future trajectory generation. This work offers architectural innovations and practical guidelines for foundation models in transaction analytics.
📝 Abstract
We present TransactionGPT (TGPT), a foundation model for consumer transaction data within one of the world's largest payment networks. TGPT is designed to understand and generate transaction trajectories while simultaneously supporting a variety of downstream prediction and classification tasks. We introduce a novel 3D-Transformer architecture specifically tailored to capture the complex dynamics in payment transaction data. This architecture incorporates design innovations that enhance modality fusion and computational efficiency while enabling joint optimization with downstream objectives. Trained on billion-scale real-world transactions, TGPT significantly improves downstream classification performance against a competitive production model and exhibits advantages over baselines in generating future transactions. We conduct extensive empirical evaluations using a diverse collection of company transaction datasets spanning multiple downstream tasks, enabling a thorough assessment of TGPT's effectiveness and efficiency in comparison to established methodologies. Furthermore, we examine the incorporation of LLM-derived embeddings within TGPT and benchmark its performance against fine-tuned LLMs, demonstrating that TGPT achieves superior predictive accuracy as well as faster training and inference. We anticipate that the architectural innovations and practical guidelines from this work will advance foundation models for transaction-like data and catalyze future research in this emerging field.