🤖 AI Summary
To handle real-world data containing arbitrary outliers, this paper proposes a unified framework for robust training of machine learning models that bridges the two dominant paradigms of robust loss design: M-estimation, popular in robotics and computer vision, and risk minimization, popular in deep learning. Methodologically, it introduces (1) a modified Black–Rangarajan duality that formally unifies the two paradigms through a shared definition of a robust loss kernel; (2) an Adaptive Alternation Algorithm (AAA) that trains on a weighted version of the non-robust loss while interpreting the weights as inlier probabilities, yielding a parameter update rule that obviates complex hyperparameter tuning; and (3) a convergence analysis showing, with no distributional assumptions on the outliers, that robust loss kernels enlarge the region of convergence to outlier-free optima. Experiments demonstrate improved robustness on regression, classification, and neural scene reconstruction tasks, and an open-source implementation is released on GitHub.
📝 Abstract
Robust training of machine learning models in the presence of outliers has garnered attention across various domains. The use of robust losses is a popular approach and is known to mitigate the impact of outliers. We bring to light two literatures that have diverged in their ways of designing robust losses: one using M-estimation, which is popular in robotics and computer vision, and another using a risk-minimization framework, which is popular in deep learning. We first show that a simple modification of the Black–Rangarajan duality provides a unifying view. The modified duality brings out a definition of a robust loss kernel $\sigma$ that is satisfied by robust losses in both literatures. Secondly, using the modified duality, we propose an Adaptive Alternation Algorithm (AAA) for training machine learning models with outliers. The algorithm iteratively trains the model using a weighted version of the non-robust loss, while updating the weights at each iteration. The algorithm is augmented with a novel parameter update rule obtained by interpreting the weights as inlier probabilities, and obviates the need for complex parameter tuning. Thirdly, we investigate convergence of the adaptive alternation algorithm to outlier-free optima. Considering arbitrary outliers (i.e., with no distributional assumption on the outliers), we show that the use of robust loss kernels $\sigma$ increases the region of convergence. We experimentally show the efficacy of our algorithm on regression, classification, and neural scene reconstruction problems. We release our implementation code: https://github.com/MIT-SPARK/ORT.
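To make the alternation idea concrete, here is a minimal sketch of this style of algorithm on robust linear regression. It is not the paper's AAA: the Geman–McClure kernel, the squared-loss base model, and the simple geometric annealing of the kernel scale `sigma` are all illustrative stand-ins for the paper's duality-derived weights and adaptive parameter update rule. Each iteration alternates between (1) minimizing a weighted non-robust (squared) loss and (2) recomputing per-sample weights, which can be read as inlier probabilities.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimize the weighted non-robust loss sum_i w_i * (x_i^T theta - y_i)^2."""
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return theta

def alternation_fit(X, y, sigma_init=50.0, sigma_final=1.0, inner_iters=5):
    """Alternate weighted model fitting and weight updates.

    Weights come from the Geman-McClure kernel (an illustrative choice);
    sigma is annealed geometrically, standing in for the paper's
    adaptive, probability-driven parameter update rule.
    """
    theta = weighted_least_squares(X, y, np.ones(len(y)))  # non-robust init
    sigma = sigma_init
    while sigma >= sigma_final:
        for _ in range(inner_iters):
            r = X @ theta - y
            # Weights in [0, 1]: near 1 for small residuals (likely inliers),
            # near 0 for large residuals (likely outliers).
            w = (sigma**2 / (sigma**2 + r**2)) ** 2
            theta = weighted_least_squares(X, y, w)
        sigma /= 2.0  # tighten the kernel scale
    return theta, w
```

For example, on a line-fitting problem where 20% of the targets are corrupted by a large offset, the final weights collapse to near zero on the corrupted samples and the fit recovers the inlier model, whereas a plain least-squares fit would be pulled toward the outliers.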