Mixup Regularization: A Probabilistic Perspective

📅 2025-02-19

📈 Citations: 0

✨ Influential: 0

career value

204K/year

🤖 AI Summary

This paper addresses the fundamental incompatibility between conventional Mixup regularization—designed for classification—and probabilistic modeling in conditional density estimation. To resolve this, we propose the first Mixup framework grounded in probabilistic blending: rather than convexly interpolating inputs and labels, our method analytically fuses likelihood functions via log-linear combination within the exponential family. Crucially, this enables feature-level probabilistic interpolation at arbitrary intermediate layers of neural networks—a capability absent in prior approaches. We provide theoretical guarantees establishing closed-form solvability of the fusion under exponential-family assumptions and demonstrate its superior generalization error bounds compared to standard Mixup. Empirical evaluation across synthetic and real-world benchmarks shows consistent and significant improvements in conditional density estimation accuracy, outperforming both standard Mixup and its leading variants with enhanced stability.

Technology Category

Application Category

📝 Abstract

In recent years, mixup regularization has gained popularity as an effective way to improve the generalization performance of deep learning models by training on convex combinations of training data. While many mixup variants have been explored, the proper adoption of the technique to conditional density estimation and probabilistic machine learning remains relatively unexplored. This work introduces a novel framework for mixup regularization based on probabilistic fusion that is better suited for conditional density estimation tasks. For data distributed according to a member of the exponential family, we show that likelihood functions can be analytically fused using log-linear pooling. We further propose an extension of probabilistic mixup, which allows for fusion of inputs at an arbitrary intermediate layer of the neural network. We provide a theoretical analysis comparing our approach to standard mixup variants. Empirical results on synthetic and real datasets demonstrate the benefits of our proposed framework compared to existing mixup variants.

Problem

Research questions and friction points this paper is trying to address.

Enhances generalization in deep learning models.

Improves conditional density estimation in machine learning.

Proposes probabilistic fusion for better data handling.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic fusion for mixup

Log-linear pooling likelihoods

Fusion at intermediate layers

🔎 Similar Papers

No similar papers found.