🤖 AI Summary
This paper investigates theoretical $L_p$ error bounds for density ratio estimation (DRE) under $f$-divergence losses. For any estimator in a class of Lipschitz continuous estimators, the authors derive an explicit quantitative relationship linking the $L_p$ error to the Kullback–Leibler (KL) divergence, the ambient data dimension $d$, and the moment term $\mathbb{E}[(r(X))^p]$. The analysis reveals that when $p > 1$, the $L_p$ error grows exponentially with the KL divergence and deteriorates with increasing dimensionality and a larger moment term. The derived upper and lower bounds are tight and extend to general $f$-divergence losses. Methodologically, the analysis combines variational representations of $f$-divergences, Lipschitz function analysis, and probabilistic inequalities, supported by numerical validation. The key contribution is uncovering how the KL divergence exponentially amplifies the $L_p$ error, providing a foundational theoretical framework for error control and model selection in high-dimensional DRE.
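As a quick illustration of this exponential dependence (a worked example of our own, not taken from the paper), consider the Gaussian mean-shift pair $P = \mathcal{N}(\mu, I_d)$ and $Q = \mathcal{N}(0, I_d)$, for which $r(x) = e^{\mu^\top x - \|\mu\|^2/2}$ and $\mathrm{KL}(P \,\|\, Q) = \|\mu\|^2/2$. A direct Gaussian moment-generating-function computation gives, for $X \sim Q$,

$$\mathbb{E}_Q\!\left[r(X)^p\right] = e^{p^2\|\mu\|^2/2 \;-\; p\|\mu\|^2/2} = \exp\bigl(p(p-1)\,\mathrm{KL}(P \,\|\, Q)\bigr),$$

so the moment term appearing in the bounds already grows exponentially in the KL divergence whenever $p > 1$, and faster as $p$ increases.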
📝 Abstract
Density ratio estimation (DRE) is a core technique in machine learning for capturing the relationship between two probability distributions. $f$-divergence loss functions, derived from variational representations of $f$-divergences, have become a standard choice in DRE for achieving state-of-the-art performance. This study provides novel theoretical insights into DRE by deriving upper and lower bounds on the $L_p$ errors of estimators trained with $f$-divergence loss functions. These bounds apply to any estimator in a class of Lipschitz continuous estimators, irrespective of the specific $f$-divergence loss function employed. The derived bounds are expressed as a product involving the data dimensionality and the expected value of the density ratio raised to the $p$-th power. Notably, the lower bound includes an exponential term that depends on the Kullback–Leibler (KL) divergence, revealing that the $L_p$ error increases significantly as the KL divergence grows when $p>1$, and that this growth becomes even more pronounced as $p$ increases. The theoretical insights are validated through numerical experiments.
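Below is a minimal numerical sketch of the phenomenon the abstract describes (our own illustration, not the paper's experimental setup): it assumes a classifier-based density ratio estimator and a Gaussian mean-shift pair, and checks how the empirical $L_p$ error behaves as the KL divergence between the two distributions increases.

```python
# Minimal sketch (assumed setup, not from the paper): estimate the density
# ratio r(x) = p(x)/q(x) between two Gaussians via probabilistic
# classification, a standard DRE baseline, then measure the empirical L_p
# error as the KL divergence between P and Q grows.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n, p_norm = 5, 20_000, 2  # dimension, sample size, L_p order (assumed values)

for shift in [0.5, 1.0, 1.5, 2.0]:
    mu = np.full(d, shift / np.sqrt(d))     # mean shift with ||mu||^2 = shift^2
    kl = 0.5 * shift**2                     # KL(P || Q) = ||mu||^2 / 2
    x_p = rng.normal(mu, 1.0, size=(n, d))  # samples from P = N(mu, I)
    x_q = rng.normal(0.0, 1.0, size=(n, d)) # samples from Q = N(0, I)

    # Classifier-based DRE: with balanced classes, the log-odds of the
    # classifier approximate log r(x).
    clf = LogisticRegression(max_iter=1000).fit(
        np.vstack([x_p, x_q]), np.r_[np.ones(n), np.zeros(n)]
    )
    r_hat = np.exp(clf.decision_function(x_q))

    # True ratio for the Gaussian mean-shift pair, evaluated on samples from Q.
    r_true = np.exp(x_q @ mu - 0.5 * mu @ mu)

    # Empirical L_p error E_Q[|r_hat - r|^p]^(1/p).
    lp_err = np.mean(np.abs(r_hat - r_true) ** p_norm) ** (1 / p_norm)
    print(f"KL = {kl:.2f}  empirical L_{p_norm} error = {lp_err:.4f}")
```

Classifier-based DRE is used here only because it is a simple, well-known baseline; the paper's bounds concern any Lipschitz continuous estimator, irrespective of the specific $f$-divergence loss. One would expect the printed error to grow sharply with KL, in line with the exponential term in the lower bound.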