When Tukey meets Chauvenet: a new boxplot criterion for outlier detection

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Tukey's boxplot uses fixed fences that do not depend on the sample size, which compromises the reliability of outlier detection; Chauvenet's criterion accounts for the sample size but relies on the sample mean and standard deviation, so it is not robust to the very outliers it is meant to detect. To address both limitations, we propose the Chauvenet-type boxplot, which brings Chauvenet's control principle into Tukey's framework. Leveraging order statistics and normal approximation theory, it derives a fence coefficient that adapts to the sample size and gives exact control of the outside rate per observation. The method is robust, interpretable, and easy to use, and is implemented in the R package *ChauBoxplot*, available on CRAN. Simulation studies and an analysis of civil service pay adjustment data from Hong Kong indicate that the approach gives more stable detection power and tighter false positive control across sample sizes than either Tukey's boxplot or Chauvenet's criterion.

📝 Abstract
The box-and-whisker plot, introduced by Tukey (1977), is one of the most popular graphical methods in descriptive statistics. Tukey's boxplot, however, does not depend on the sample size, yielding the so-called "one-size-fits-all" fences for outlier detection. Although sample-size-adjusted boxplots do exist in the literature, most of them are either not easy to implement or lack justification. As another common rule for outlier detection, Chauvenet's criterion uses the sample mean and standard deviation to perform the test, but these statistics are sensitive to the outliers included in the sample, and hence the criterion is not robust. In this paper, by combining Tukey's boxplot and Chauvenet's criterion, we introduce a new boxplot, namely the Chauvenet-type boxplot, with the fence coefficient determined by an exact control of the outside rate per observation. Our new outlier criterion not only maintains the simplicity of the boxplot from a practical perspective, but also serves as a robust Chauvenet's criterion. A simulation study and a real data analysis on the civil service pay adjustment in Hong Kong demonstrate that the Chauvenet-type boxplot performs extremely well regardless of the sample size, and can therefore be highly recommended for practical use to replace both Tukey's boxplot and Chauvenet's criterion. Lastly, to increase the visibility of the work, a user-friendly R package named "ChauBoxplot" has also been officially released on CRAN.
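To make the two classical rules in the abstract concrete, here is a minimal Python sketch of each, using only the standard library. It is not taken from the paper or from the ChauBoxplot package. The function names, the made-up sample, the conventional Tukey coefficient k = 1.5, and the "inclusive" quartile convention are all assumptions for illustration. Chauvenet's criterion is written in its usual textbook form: flag a point when the expected number of observations at least as extreme is below 0.5 under a fitted normal model.

```python
from statistics import NormalDist, mean, quantiles, stdev


def tukey_outliers(data, k=1.5):
    """Flag points outside Tukey's fences Q1 - k*IQR and Q3 + k*IQR.

    The fences use a fixed k, so they ignore the sample size.
    """
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


def chauvenet_outliers(data):
    """Flag points by Chauvenet's criterion (textbook form).

    A point is flagged when n * P(|Z| >= z) < 0.5, which is the same as
    |x - mean| / sd > Phi^{-1}(1 - 1/(4n)). The cutoff grows with n, but
    the mean and sd are themselves pulled around by outliers.
    """
    n = len(data)
    centre, spread = mean(data), stdev(data)
    z_crit = NormalDist().inv_cdf(1 - 1 / (4 * n))
    return [x for x in data if abs(x - centre) > z_crit * spread]


# Made-up illustrative sample with one obviously extreme value.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 15.0]
print(tukey_outliers(sample))
print(chauvenet_outliers(sample))
```

On this sample both rules flag only the value 15.0. The contrast the abstract draws is in how each rule behaves as the sample size changes: Tukey's fences stay put, while Chauvenet's cutoff widens with n but rests on non-robust estimates.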
Problem

Research questions and friction points this paper is trying to address.

Develops a new boxplot method for robust outlier detection
Combines Tukey's boxplot and Chauvenet's criterion advantages
Provides sample-size-adaptive outlier detection that is easy to implement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines Tukey boxplot with Chauvenet criterion
Introduces Chauvenet-type boxplot for robustness
Provides user-friendly R package for implementation
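One way to see how a fence coefficient can depend on the sample size is the back-of-the-envelope calculation sketched below. It is an illustration under stated assumptions, not necessarily the exact coefficient derived in the paper or used by the ChauBoxplot package. It assumes normal data and population quartiles, so that Q3 + k*IQR equals mu + (z_0.75 + 2*z_0.75*k)*sigma, and it sets that fence equal to the textbook Chauvenet cutoff mu + z*sigma with z = Phi^{-1}(1 - 1/(4n)). The function name is hypothetical.

```python
from statistics import NormalDist


def chauvenet_style_fence_coefficient(n):
    """Hypothetical helper: k such that Q3 + k*IQR matches Chauvenet's cutoff.

    Assumes normal data and population quartiles. Solving
        z_q3 + 2 * z_q3 * k = z_crit
    for k gives k = (z_crit - z_q3) / (2 * z_q3).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - 1 / (4 * n))  # Chauvenet cutoff in sd units
    z_q3 = std_normal.inv_cdf(0.75)               # upper quartile, about 0.6745
    return (z_crit - z_q3) / (2 * z_q3)


for n in (10, 100, 1000):
    print(n, round(chauvenet_style_fence_coefficient(n), 3))
```

Under these assumptions the coefficient increases with n, and the fixed value 1.5 used by Tukey's boxplot falls between the values obtained for n = 10 and n = 100. This is the sense in which a fixed fence is too liberal for small samples and too strict for large ones.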
Hongmei Lin
School of Statistics and Data Science, Shanghai University of International Business and Economics, Shanghai, China
Riquan Zhang
School of Statistics and Data Science, Shanghai University of International Business and Economics, Shanghai, China
Tiejun Tong
Professor of Statistics, Hong Kong Baptist University
Statistics, Biostatistics, Meta-analysis, Evidence-based Practice