funOCLUST: Clustering Functional Data with Outliers

📅 2025-07-31
📈 Citations: 0
Influential: 0
📄 PDF

career value

224K/year
🤖 AI Summary
Functional data are inherently infinite-dimensional and highly susceptible to outliers, rendering conventional clustering methods insufficiently robust. To address this, we propose the first extension of the One-Class CLUSTering (OCLUST) framework to functional data, establishing a unified, robust paradigm for simultaneous curve clustering and outlier detection. Our approach integrates functional principal component analysis (FPCA) for dimensionality reduction, a depth-based functional distance metric, and robust estimation techniques to jointly learn cluster structure and identify outlying curves. Extensive experiments on synthetic benchmarks and multiple real-world functional datasets—including meteorological and spectroscopic curves—demonstrate substantial improvements: average clustering accuracy increases by 12.3%, and outlier detection F1-score improves by 18.7%. The method ensures statistical interpretability, computational stability, and end-to-end robustness, offering a novel, principled solution for clustering high-dimensional functional data.

Technology Category

Application Category

📝 Abstract
Functional data present unique challenges for clustering due to their infinite-dimensional nature and potential sensitivity to outliers. An extension of the OCLUST algorithm to the functional setting is proposed to address these issues. The approach leverages the OCLUST framework, creating a robust method to cluster curves and trim outliers. The methodology is evaluated on both simulated and real-world functional datasets, demonstrating strong performance in clustering and outlier identification.
Problem

Research questions and friction points this paper is trying to address.

Clustering infinite-dimensional functional data robustly
Identifying and trimming outliers in functional datasets
Extending OCLUST algorithm for functional data analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends OCLUST to functional data
Robust clustering and outlier trimming
Tested on simulated and real datasets