🤖 AI Summary
This work addresses a limitation of conventional regression loss functions: they are typically based on the difference between predicted and true values and are therefore ill-suited to tasks with multiplicative noise or tasks where the relative error is the primary concern. The paper surveys ratio-based loss functions, which are defined in terms of the quotient of the predicted and true values. Its mathematical analysis is not tied to any specific learning algorithm and examines fundamental properties such as continuity, Lipschitz continuity, convexity, and differentiability. The study also proposes a few new losses and aims to enable future research on consistency, learning rates, and algorithmic stability.
📝 Abstract
Algorithms in machine learning and AI depend critically on at least three key components: (i) the risk function, which is the expectation of the loss function, (ii) the function space, which is often called the hypothesis space, and (iii) the set of probability measures that are allowed for the specified algorithm. This paper gives a survey of a certain class of loss functions, which we call ratio-based. In supervised learning, two classes of loss functions are common: margin-based loss functions for classification tasks, which depend on the product of the output values $y_i$ and the predictions $f(x_i)$, and distance-based loss functions for regression, which depend on the difference of $y_i$ and $f(x_i)$. Distance-based loss functions are particularly useful if an additive model assumption seems plausible, i.e., the common signal-plus-noise assumption. However, several loss functions proposed in the literature for regression purposes have a multiplicative error structure in mind and pay attention to relative errors, i.e., to the ratio of $y_i$ and $f(x_i)$. In this survey article, we systematically investigate such ratio-based loss functions and propose a few new losses, which may be interesting for future research. We concentrate on general properties of ratio-based loss functions such as continuity, Lipschitz continuity, convexity, and differentiability, because these properties play a central role in most machine learning algorithms. Therefore, we do not focus on a specific machine learning algorithm to derive universal consistency, learning rates, or stability results. Instead, we want to enable future research in this direction.
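The contrast between distance-based and ratio-based losses can be illustrated with a minimal sketch. The function names and the specific ratio-based loss used here (the absolute logarithm of the quotient $f(x_i)/y_i$) are assumptions chosen for illustration and are not taken from the paper; the sketch also assumes strictly positive outputs and predictions.

```python
import math


def absolute_loss(y, pred):
    """Distance-based loss: depends only on the difference y - pred."""
    return abs(y - pred)


def abs_log_ratio_loss(y, pred):
    """Illustrative ratio-based loss: depends only on the quotient pred / y.

    Assumes y > 0 and pred > 0 so that the quotient and its log are defined.
    """
    if y <= 0 or pred <= 0:
        raise ValueError("y and pred must be strictly positive")
    return abs(math.log(pred / y))


# The same 10% relative over-prediction at two different scales.
small_scale = (10.0, 11.0)      # (y, pred)
large_scale = (1000.0, 1100.0)  # (y, pred)

# The distance-based loss grows with the scale of the data ...
print(absolute_loss(*small_scale), absolute_loss(*large_scale))
# ... while the ratio-based loss sees only the relative error.
print(abs_log_ratio_loss(*small_scale), abs_log_ratio_loss(*large_scale))
```

Under these assumptions, the distance-based loss penalizes the large-scale example more heavily, even though both predictions are off by the same relative amount. The ratio-based loss treats the two cases alike, which is the behavior one would want under a multiplicative error structure.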