Kernel Density Estimation and Convolution Revisited

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Classical kernel density estimation (KDE) suffers from bandwidth sensitivity, severe boundary bias, and computational inefficiency, especially on bounded domains. To address these issues, we propose SHIDE: a convolution-based KDE method employing bounded polynomial kernels constructed by convolving uniform distributions, coupled with bounded-noise injection for pseudo-data generation and spline interpolation of the resulting histogram. SHIDE inherently respects domain constraints, eliminating the need for explicit boundary correction. We establish its theoretical convergence rate as the optimal $n^{-4/5}$, demonstrate significantly reduced boundary bias, and prove asymptotic mean integrated squared error (AMISE) optimality. Empirical evaluations show that SHIDE matches or outperforms state-of-the-art KDE methods on bounded domains and heavy-tailed distributions, achieving both high accuracy and computational efficiency.
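To make the kernel class concrete: the simplest member, obtained by convolving two uniform densities, is the triangular kernel. (The general polynomial kernels in the paper arise from repeated convolutions; the explicit form below is only this minimal instance, with the support bound $h$ playing the role of the bandwidth.)

$$
K_h(u) \;=\; \Bigl(\mathrm{Unif}\bigl[-\tfrac{h}{2},\tfrac{h}{2}\bigr] * \mathrm{Unif}\bigl[-\tfrac{h}{2},\tfrac{h}{2}\bigr]\Bigr)(u)
\;=\; \frac{h - |u|}{h^{2}}, \qquad |u| \le h .
$$

One can check that $K_h$ integrates to one and vanishes outside $[-h, h]$, which is what lets noise drawn from it keep pseudo-data within a bounded domain up to a margin of $h$.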

📝 Abstract
Kernel Density Estimation (KDE) is a cornerstone of nonparametric statistics, yet it remains sensitive to bandwidth choice, boundary bias, and computational inefficiency. This study revisits KDE through a principled convolutional framework, providing an intuitive model-based derivation that naturally extends to constrained domains, such as positive-valued random variables. Building on this perspective, we introduce SHIDE (Simulation and Histogram Interpolation for Density Estimation), a novel and computationally efficient density estimator that generates pseudo-data by adding bounded noise to observations and applies spline interpolation to the resulting histogram. The noise is sampled from a class of bounded polynomial kernel densities, constructed through convolutions of uniform distributions, with a natural bandwidth parameter defined by the kernel's support bound. We establish the theoretical properties of SHIDE, including pointwise consistency, bias-variance decomposition, and asymptotic MISE, showing that SHIDE attains the classical $n^{-4/5}$ convergence rate while mitigating boundary bias. Two data-driven bandwidth selection methods are developed, an AMISE-optimal rule and a percentile-based alternative, which are shown to be asymptotically equivalent. Extensive simulations demonstrate that SHIDE performs comparably to or surpasses KDE across a broad range of models, with particular advantages for bounded and heavy-tailed distributions. These results highlight SHIDE as a theoretically grounded and practically robust alternative to traditional KDE.
Problem

Research questions and friction points this paper is trying to address.

Addresses bandwidth sensitivity and boundary bias in kernel density estimation
Develops efficient density estimator for bounded domains and heavy-tailed distributions
Provides theoretical guarantees and data-driven bandwidth selection methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulates pseudo-data with bounded noise
Interpolates histograms using spline techniques
Uses polynomial kernels from uniform convolutions
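The three steps above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration, not the authors' implementation: the function name `shide_density`, the choice of `m` uniforms summed to form the bounded noise, the bin count, and the use of `CubicSpline` on a density-normalized histogram are all assumptions; the paper's bandwidth-selection rules and exact kernel family are not reproduced here.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def shide_density(x, h=0.5, m=2, bins=50, seed=None):
    """Illustrative SHIDE-style estimator: pseudo-data + histogram + spline.

    Noise is the sum of m uniforms on [-h/m, h/m], so its support is
    [-h, h]; m=2 gives the triangular (convolution-of-uniforms) kernel.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-h / m, h / m, size=(len(x), m)).sum(axis=1)
    pseudo = x + noise                      # pseudo-data generation
    counts, edges = np.histogram(pseudo, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return CubicSpline(centers, counts)     # smooth interpolant of the histogram

# Usage: estimate the density of positive-valued (exponential) data.
rng = np.random.default_rng(0)
data = rng.exponential(scale=1.0, size=2000)
f_hat = shide_density(data, h=0.4, m=2, bins=60, seed=1)
grid = np.linspace(0.1, 4.0, 9)
print(np.round(f_hat(grid), 3))
```

Because the noise is bounded by `h`, pseudo-data stay within `h` of the observed support, which is the mechanism behind the reduced boundary bias claimed for bounded domains.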
Nicholas Tenkorang
Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX 79968
Kwesi Appau Ohene-Obeng
Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX 79968
Xiaogang Su
Professor, University of Texas at El Paso (UTEP)
Statistics · Machine Learning