MMbeddings: Parameter-Efficient, Low-Overfitting Probabilistic Embeddings Inspired by Nonlinear Mixed Models

📅 2025-10-25

📈 Citations: 0

✨ Influential: 0

career value

212K/year

🤖 AI Summary

Traditional embedding methods for high-cardinality categorical data suffer from excessive parameter counts—scaling linearly with dictionary size—and thus incur high computational costs and overfitting risks. To address this, we propose MMbeddings, a probabilistic embedding framework based on variational autoencoders that introduces nonlinear mixture models into deep embedding learning for the first time. It models categorical variables as stochastic latent variables drawn from learnable distributions and employs a shared encoder to produce low-dimensional embeddings. By replacing fixed embeddings with random effects, MMbeddings reduces parameter count by over 90% while significantly improving generalization. Extensive experiments on collaborative filtering and tabular regression tasks demonstrate that MMbeddings consistently outperforms state-of-the-art baselines across multiple real-world and synthetic datasets, achieving substantial reductions in memory footprint and training cost.

Technology Category

Application Category

📝 Abstract

We present MMbeddings, a probabilistic embedding approach that reinterprets categorical embeddings through the lens of nonlinear mixed models, effectively bridging classical statistical theory with modern deep learning. By treating embeddings as latent random effects within a variational autoencoder framework, our method substantially decreases the number of parameters -- from the conventional embedding approach of cardinality $ imes$ embedding dimension, which quickly becomes infeasible with large cardinalities, to a significantly smaller, cardinality-independent number determined primarily by the encoder architecture. This reduction dramatically mitigates overfitting and computational burden in high-cardinality settings. Extensive experiments on simulated and real datasets, encompassing collaborative filtering and tabular regression tasks using varied architectures, demonstrate that MMbeddings consistently outperforms traditional embeddings, underscoring its potential across diverse machine learning applications.

Problem

Research questions and friction points this paper is trying to address.

Reduces parameters in high-cardinality categorical embeddings

Mitigates overfitting through probabilistic latent random effects

Bridges classical statistical theory with deep learning frameworks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic embeddings reinterpret categorical embeddings via nonlinear mixed models

Treats embeddings as latent random effects in variational autoencoder framework

Reduces parameters to cardinality-independent number for overfitting mitigation

🔎 Similar Papers

Towards One Model for Classical Dimensionality Reduction: A Probabilistic Perspective on UMAP and t-SNE