🤖 AI Summary
This paper addresses the geometric misalignment of traditional Mirror Descent (MD) and Exponentiated Gradient (EG) algorithms when applied to non-Euclidean, heavy-tailed, or sparse data. To this end, the authors propose a unified optimization framework grounded in trace-form entropies, such as the Tsallis and Kaniadakis entropies. Methodologically, they systematically introduce learnable, parameterized deformed logarithmic and exponential functions, construct the associated Bregman divergences, and thereby achieve geometry-adaptive regularization, leading to novel MD and generalized multiplicative update algorithms. The contributions are threefold: (1) establishing a rigorous theoretical closed loop linking deformed entropies, deformed functions, and the corresponding Bregman divergences; (2) enabling adaptive learning of the deformation parameters, which accelerates convergence and improves generalization robustness; and (3) providing a more flexible and interpretable optimization foundation tailored to complex data structures.
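As an illustrative sketch only (not the authors' reference implementation), the NumPy snippet below shows a Tsallis-type deformed logarithm/exponential pair and a single generalized multiplicative (GEG-style) update built from them. The function names and the quadratic-loss example are our own; the deformation parameter q is the kind of hyperparameter the framework would tune or learn.

```python
import numpy as np

def log_q(x, q=1.0):
    # Tsallis q-deformed logarithm; reduces to log(x) as q -> 1.
    if np.isclose(q, 1.0):
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q=1.0):
    # Tsallis q-deformed exponential, the inverse of log_q on its domain.
    if np.isclose(q, 1.0):
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

def geg_step(w, grad, lr=0.1, q=1.0):
    # One GEG-style update: map the weights through log_q, take a gradient
    # step in the deformed-log domain, and map back with exp_q.
    # For q = 1 this is the classical multiplicative exponentiated-gradient update.
    return exp_q(log_q(w, q) - lr * grad, q)

# Example: a single step on a simple quadratic loss (hypothetical values).
w = np.array([0.2, 0.5, 0.3])
grad = w - np.array([0.1, 0.7, 0.2])        # gradient of 0.5 * ||w - target||^2
w_next = geg_step(w, grad, lr=0.5, q=1.5)   # q is a tunable/learnable hyperparameter
```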
📝 Abstract
In this paper we propose and investigate a wide class of Mirror Descent (MD) updates and associated novel Generalized Exponentiated Gradient (GEG) algorithms by exploiting various trace-form entropies and their associated deformed logarithms, together with their inverses, the deformed (generalized) exponential functions. The proposed algorithms can be viewed as extensions of entropic MD and generalizations of multiplicative updates. Since the literature now contains over fifty mathematically well-defined generalized entropies, it is impossible to exploit all of them in a single paper; we therefore focus on a few of the most popular entropies and associated logarithms, such as the Tsallis, Kaniadakis, and Sharma-Taneja-Mittal entropies, as well as some of their extensions, such as the Tempesta and Kaniadakis-Scarfone entropies. The shape and properties of the deformed logarithms and their inverses are tuned by one or more hyperparameters. By learning these hyperparameters, the updates can adapt to the distribution of the training data and be tailored to the specific geometry of the optimization problem, leading to potentially faster convergence and better performance. Using generalized entropies and the associated deformed logarithms in the Bregman divergence, which serves as the regularization term, also provides new insight into exponentiated gradient descent updates.
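As a minimal illustration of this idea (in our own notation, for the Tsallis-type deformation with the mirror map Φ given by a negative deformed entropy), the Bregman-regularized MD step and the resulting GEG-style multiplicative update can be sketched as follows:

```latex
% Sketch (our notation): entropic mirror descent with a deformed-log mirror map.
% \Phi is the mirror map (a negative generalized entropy); \eta_t is the step size.
\mathbf{w}_{t+1}
  = \arg\min_{\mathbf{w}}
    \Big\{ \eta_t \,\langle \nabla L(\mathbf{w}_t), \mathbf{w} \rangle
           + D_{\Phi}(\mathbf{w} \,\|\, \mathbf{w}_t) \Big\},
\qquad
D_{\Phi}(\mathbf{w} \,\|\, \mathbf{v})
  = \Phi(\mathbf{w}) - \Phi(\mathbf{v})
    - \langle \nabla \Phi(\mathbf{v}), \mathbf{w} - \mathbf{v} \rangle .
% If \nabla\Phi acts componentwise as a deformed logarithm \log_q (Tsallis case),
% the update admits the closed-form generalized exponentiated-gradient step
\mathbf{w}_{t+1}
  = \exp_q\!\big( \log_q(\mathbf{w}_t) - \eta_t \nabla L(\mathbf{w}_t) \big),
% which recovers the classical multiplicative EG update as q \to 1.
```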