🤖 AI Summary
Least absolute deviations (LAD) line fitting is robust to outliers, but the exact linear and near-linear time algorithms in the literature are difficult to implement and lack maintained public implementations. In practice, users fall back on linear programming (LP) solvers (e.g., the simplex-based Barrodale-Roberts method or interior-point methods) or on iteratively reweighted least squares (IRLS) approximations, which do not guarantee exact solutions.
Method: We propose Piecewise Affine Lower-Bounding (PALB), an exact LAD line-fitting algorithm that is straightforward to implement and deploy. It builds piecewise-affine lower bounds from supporting lines derived from subgradients and refines them with a subdivision scheme driven by the minima of those bounds. Correctness is proven, and the number of iterations is bounded.
Contribution/Results: The reference implementation is written in Rust with a Python API. On synthetic datasets and on a real dataset from NOAA's Integrated Surface Database, the algorithm exhibits empirical log-linear scaling. It is consistently faster than publicly available implementations of LP-based and IRLS-based solvers while returning exact solutions.
📝 Abstract
Least-absolute-deviations (LAD) line fitting is robust to outliers but computationally more involved than least squares regression. Although the literature includes linear and near-linear time algorithms for the LAD line fitting problem, these methods are difficult to implement and, to our knowledge, lack maintained public implementations. As a result, practitioners often resort to linear programming (LP) based methods, such as the simplex-based Barrodale-Roberts method and interior-point methods, or to iteratively reweighted least squares (IRLS) approximations, which do not guarantee exact solutions. To close this gap, we propose the Piecewise Affine Lower-Bounding (PALB) method, an exact algorithm for LAD line fitting. PALB uses supporting lines derived from subgradients to build piecewise-affine lower bounds, and employs a subdivision scheme involving the minima of these lower bounds. We prove correctness and provide bounds on the number of iterations. On synthetic datasets with varied signal types and noise, including heavy-tailed outliers, as well as on a real dataset from NOAA's Integrated Surface Database, PALB exhibits empirical log-linear scaling. It is consistently faster than publicly available implementations of LP-based and IRLS-based solvers. We provide a reference implementation written in Rust with a Python API.
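To make the lower-bounding idea concrete, here is a minimal Python sketch of the building block only. It is not the PALB algorithm or its reference implementation, and the abstract does not say which variable the bounds are built over. The sketch assumes the intercept is eliminated first (a median of the residuals is optimal for a fixed slope), which leaves a convex cost in the slope alone. A subgradient of that cost then gives a supporting line, and the maximum of several supporting lines is a piecewise-affine lower bound. The names `reduced_lad` and `lower_bound`, the test data, and the chosen slopes are all hypothetical.

```python
import numpy as np


def reduced_lad(x, y, slope):
    """Return (g, s): LAD cost at `slope` with an optimal intercept,
    and one subgradient s of g with respect to the slope."""
    r = y - slope * x
    e = r - np.median(r)  # a median of r is an optimal intercept
    sign = np.sign(e)
    zero = e == 0
    if zero.any():
        # Pick values in [-1, 1] for zero residuals so the signs sum to 0,
        # i.e. the intercept component of the joint subgradient vanishes.
        sign[zero] = -sign[~zero].sum() / zero.sum()
    return np.abs(e).sum(), -(x * sign).sum()


def lower_bound(cuts, slope):
    """Piecewise-affine lower bound: max over supporting lines (a0, g0, s0)."""
    return max(g0 + s0 * (slope - a0) for a0, g0, s0 in cuts)


# Hypothetical data: a line plus heavy-tailed noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=201)
y = 2.0 * x + 0.5 + rng.standard_t(df=1.5, size=201)

# Supporting lines taken at three arbitrary slopes.
cuts = [(a0, *reduced_lad(x, y, a0)) for a0 in (-5.0, 0.0, 5.0)]

# Gap between the true cost and the lower bound on a grid of slopes.
grid = np.linspace(-6.0, 6.0, 241)
gap = [reduced_lad(x, y, a)[0] - lower_bound(cuts, a) for a in grid]
```

Every entry of `gap` should be nonnegative up to rounding, and zero at the slopes where a supporting line was taken. An exact method along the lines described above would keep adding supporting lines where the bound attains its minimum; that subdivision logic is deliberately left out here because the abstract does not specify it.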