Mixtures of multivariate linear asymmetric Laplace regressions with multiple asymmetric Laplace covariates

📅 2025-05-09

🤖 AI Summary
Jointly clustering responses and random covariates is difficult when the data are non-Gaussian and skewed, and atypical observations can distort the fit. This paper proposes the contaminated shifted asymmetric Laplace cluster-weighted model (cSALCWM), a heavy-tailed extension of the shifted asymmetric Laplace cluster-weighted model (SALCWM). Once the model is fitted, the observations within each cluster can be classified as typical points, mild outliers, good leverage points, or bad leverage points. The authors establish theoretical identifiability conditions, develop an expectation-conditional maximization (ECM) algorithm for maximum likelihood estimation, and provide an open-source R implementation at https://github.com/arnootto/ALCWM. Simulation studies and real-data analyses indicate that the cSALCWM preserves the modelling strengths of the SALCWM while improving outlier detection and the reliability of inference.

📝 Abstract
In response to the challenge of accommodating non-Gaussian behaviour in data, the shifted asymmetric Laplace (SAL) cluster-weighted model (SALCWM) is introduced as a model-based method for jointly clustering responses and random covariates that exhibit skewness. Within each cluster, the multivariate SAL distribution is assumed for both the covariates and the responses given the covariates. To mitigate the effect of possible atypical observations, a heavy-tailed extension, the contaminated SALCWM (cSALCWM), is also proposed. In addition to the SALCWM parameters, each mixture component has a parameter controlling the proportion of outliers, one controlling the proportion of leverage points, one specifying the degree of outlierness, and another specifying the degree of leverage. The cSALCWM has the added benefit that once the model parameters are estimated and the observations are assigned to components, a more refined intra-group classification into typical points, (mild) outliers, good leverage points, and bad leverage points can be directly obtained. An expectation-conditional maximization algorithm is developed for efficient maximum likelihood parameter estimation under this framework. Theoretical identifiability conditions are established, and empirical results from simulation studies and real-world applications demonstrate that the cSALCWM not only preserves the modelling strengths of the SALCWM but also significantly enhances outlier detection and overall inference reliability. The methodology proposed in this paper has been implemented in an R package, which is publicly available at https://github.com/arnootto/ALCWM.
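The four-way intra-group classification described in the abstract can be illustrated with a minimal sketch. It assumes that a fitted model supplies, for each observation, a posterior probability of being "good" (non-atypical) in the covariates and another of being "good" in the response given the covariates. The function name, argument names, and the 0.5 threshold are hypothetical and are not taken from the paper or its R package. The label mapping follows the usual convention in contaminated cluster-weighted models, which the abstract does not spell out.

```python
from typing import List, Sequence


def classify_points(
    prob_good_x: Sequence[float],
    prob_good_y: Sequence[float],
    threshold: float = 0.5,
) -> List[str]:
    """Label each observation from two posterior probabilities.

    prob_good_x: probability the point is not atypical in the covariates
                 (hypothetical input; low values flag a leverage point).
    prob_good_y: probability the point is not atypical in the response
                 given the covariates (low values flag an outlier).
    threshold:   assumed cut-off; 0.5 is an illustrative choice.
    """
    if len(prob_good_x) != len(prob_good_y):
        raise ValueError("both probability sequences must have equal length")

    labels = []
    for p_x, p_y in zip(prob_good_x, prob_good_y):
        is_leverage = p_x < threshold  # atypical covariate values
        is_outlier = p_y < threshold   # atypical response given covariates
        if not is_leverage and not is_outlier:
            labels.append("typical")
        elif not is_leverage and is_outlier:
            labels.append("mild outlier")
        elif is_leverage and not is_outlier:
            labels.append("good leverage")
        else:
            labels.append("bad leverage")
    return labels


# Illustrative probabilities only, not results from the paper.
example = classify_points([0.9, 0.9, 0.1, 0.1], [0.9, 0.1, 0.9, 0.1])
```

Under these assumptions a good leverage point has unusual covariate values but a response consistent with the cluster's regression, whereas a bad leverage point is atypical in both.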
Problem

Research questions and friction points this paper is trying to address.

Jointly clustering responses and random covariates that are skewed and non-Gaussian
Limiting the influence of outliers and leverage points on the fitted clusters
Estimating the model parameters efficiently by maximum likelihood
Innovation

Methods, ideas, or system contributions that make the work stand out.

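Assumes a multivariate shifted asymmetric Laplace (SAL) distribution, within each cluster, for both the covariates and the responses given the covariates
Extends the SALCWM to the heavy-tailed cSALCWM, which classifies points as typical, mild outliers, good leverage, or bad leverage
Develops an expectation-conditional maximization (ECM) algorithm for maximum likelihood estimation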
A. F. Otto
Department of Statistics, University of Pretoria, Pretoria, South Africa
Andriëtte Bekker
Department of Statistics, University of Pretoria, Pretoria, South Africa
A. Punzo
Department of Economics and Business, University of Catania, Catania, Italy
Johannes T. Ferreira
School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg, South Africa
Cristina Tortora
Professor, San Jose State University
Statistics