🤖 AI Summary
Heterogeneous, high-dimensional, and sparse tabular customer data make effective entity representation learning difficult. Method: This paper proposes DEEPCAE, a multi-layer contractive autoencoder (CAE) framework for general-purpose entity embedding. Its core contributions are (i) a formalization of a general-purpose entity embedding framework and (ii) a novel method for calculating the contractive regularization term in multi-layer CAEs. Results: Experiments across 13 real-world customer datasets show that DEEPCAE reduces reconstruction error by 34% compared to a stacked CAE and outperforms all other tested autoencoder variants on downstream prediction tasks, including classification and regression, supporting its effectiveness and generality.
📝 Abstract
Recent advances in representation learning have successfully leveraged the underlying domain-specific structure of data across various fields. However, representing diverse and complex entities stored in tabular format within a latent space remains challenging. In this paper, we introduce DEEPCAE, a novel method for calculating the regularization term for multi-layer contractive autoencoders (CAEs). Additionally, we formalize a general-purpose entity embedding framework and use it to empirically show that DEEPCAE outperforms all other tested autoencoder variants in both reconstruction performance and downstream prediction performance. Notably, when compared to a stacked CAE across 13 datasets, DEEPCAE achieves a 34% improvement in reconstruction error.
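For context on the regularization term the abstract refers to: a contractive autoencoder (Rifai et al., 2011) penalizes the squared Frobenius norm of the encoder's Jacobian, encouraging the learned representation to be locally insensitive to input perturbations. The sketch below shows this standard single-layer penalty for a sigmoid encoder, where the closed form is well known; it illustrates the quantity being regularized, not DEEPCAE's multi-layer computation, and the function names are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cae_penalty(x, W, b):
    """Contractive penalty ||J_f(x)||_F^2 for a single-layer sigmoid
    encoder h = sigmoid(W @ x + b).

    Since dh_j/dx_i = h_j * (1 - h_j) * W[j, i], the squared Frobenius
    norm factorizes (Rifai et al., 2011):
        ||J||_F^2 = sum_j (h_j * (1 - h_j))^2 * sum_i W[j, i]^2
    """
    h = sigmoid(W @ x + b)
    return np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1))

# Sanity check against the explicitly built Jacobian.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
h = sigmoid(W @ x + b)
J = (h * (1.0 - h))[:, None] * W          # full Jacobian dh/dx
print(np.isclose(cae_penalty(x, W, b), np.sum(J ** 2)))  # True
```

A "stacked" CAE applies this penalty layer by layer during greedy pretraining; the abstract's claim is that DEEPCAE instead computes the regularization term for the multi-layer encoder as a whole, which is what yields the reported 34% reconstruction improvement.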