Enhancing Kernel Power K-means: Scalable and Robust Clustering with Random Fourier Features and Possibilistic Method

📅 2025-11-13

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Kernel power k-means (KPKM) has two limitations: high computational cost, because it relies on the full kernel matrix, and weak noise robustness, because it lacks genuine centroid-sample assignment learning. To address the first, we propose RFF-KPKM, which introduces the first approximation theory for applying random Fourier features (RFF) to KPKM, with an excess risk bound, strong consistency, and a $(1+\varepsilon)$ relative error bound. Building on this, we introduce IP-RFF-MKPKM, an improved possibilistic RFF-based multiple kernel power k-means that combines possibilistic and fuzzy memberships to refine cluster assignments. The approach avoids constructing the full kernel matrix and instead works with low-dimensional RFF feature maps. Experiments on large-scale datasets show that the proposed methods are more efficient and more accurate than state-of-the-art alternatives.

📝 Abstract

Kernel power $k$-means (KPKM) leverages a family of means to mitigate local minima issues in kernel $k$-means. However, KPKM faces two key limitations: (1) the computational burden of the full kernel matrix restricts its use on extensive data, and (2) the lack of authentic centroid-sample assignment learning reduces its noise robustness. To overcome these challenges, we propose RFF-KPKM, introducing the first approximation theory for applying random Fourier features (RFF) to KPKM. RFF-KPKM employs RFF to generate efficient, low-dimensional feature maps, bypassing the need for the whole kernel matrix. Crucially, we are the first to establish strong theoretical guarantees for this combination: (1) an excess risk bound of $\mathcal{O}(\sqrt{k^3/n})$, (2) strong consistency with membership values, and (3) a $(1+\varepsilon)$ relative error bound achievable using the RFF of dimension $\mathrm{poly}(\varepsilon^{-1}\log k)$. Furthermore, to improve robustness and the ability to learn multiple kernels, we propose IP-RFF-MKPKM, an improved possibilistic RFF-based multiple kernel power $k$-means. IP-RFF-MKPKM ensures the scalability of MKPKM via RFF and refines cluster assignments by combining the merits of the possibilistic membership and fuzzy membership. Experiments on large-scale datasets demonstrate the superior efficiency and clustering accuracy of the proposed methods compared to the state-of-the-art alternatives.
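
To make the RFF idea in the abstract concrete, the following is a minimal sketch of the standard random Fourier feature map for a Gaussian kernel, not the paper's RFF-KPKM algorithm. The function name `rff_features`, the kernel parameterization $\exp(-\gamma\|x-y\|^2)$, and all sizes are assumptions chosen for illustration. It shows how inner products of a low-dimensional feature map can stand in for kernel entries; the exact kernel matrix is built here only to check the approximation.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Map X (n, d) to Z (n, n_features) so that Z @ Z.T approximates
    the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases make the cosine features an unbiased kernel estimate.
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # toy data (illustrative only)
gamma = 0.1                        # assumed kernel bandwidth parameter
Z = rff_features(X, n_features=2000, gamma=gamma, rng=rng)

# Exact kernel matrix, computed only to measure the approximation error.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-gamma * sq_dists)
max_abs_err = np.abs(Z @ Z.T - K_exact).max()
print(f"max |Z Z^T - K| = {max_abs_err:.3f}")
```

A downstream clustering method can then operate on the rows of `Z` directly, so memory scales with the number of features rather than with the square of the sample count.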

Problem

Research questions and friction points this paper is trying to address.

Addresses scalability issues in kernel clustering using random Fourier features

Enhances noise robustness through improved centroid-sample assignment learning

Provides theoretical guarantees for clustering accuracy and computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses random Fourier features for scalable kernel approximation

Combines possibilistic and fuzzy memberships for robust clustering

Establishes theoretical guarantees: an excess risk bound, strong consistency, and a $(1+\varepsilon)$ relative error bound
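
As a rough illustration of why possibilistic and fuzzy memberships behave differently under noise, here is a sketch using the textbook fuzzy c-means membership and the Krishnapuram-Keller possibilistic typicality. These are not the IP-RFF-MKPKM update rules, which this page does not give; the function names, the fuzzifier `m`, and the scale parameter `eta` are assumptions for illustration.

```python
import numpy as np

def fuzzy_memberships(D2, m=2.0):
    """Fuzzy c-means memberships from squared distances D2 (n, k).
    Each row sums to 1, so every point is fully shared among clusters."""
    D2 = np.maximum(D2, 1e-12)  # guard against division by zero
    inv = D2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def possibilistic_typicalities(D2, eta, m=2.0):
    """Possibilistic c-means typicalities (Krishnapuram-Keller form).
    Each value lies in (0, 1]; rows are not forced to sum to 1."""
    return 1.0 / (1.0 + (D2 / eta) ** (1.0 / (m - 1.0)))

centroids = np.array([[0.0, 0.0], [4.0, 0.0]])
points = np.array([
    [0.1, 0.0],    # close to the first centroid
    [2.0, 50.0],   # far outlier, equidistant from both centroids
])
D2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)

U = fuzzy_memberships(D2)
T = possibilistic_typicalities(D2, eta=np.array([1.0, 1.0]))
print("fuzzy:\n", U)
print("possibilistic:\n", T)
```

In this toy setup the outlier receives equal fuzzy membership in both clusters because the row must sum to 1, while its possibilistic typicality is close to zero for both, which is the usual motivation for combining the two kinds of membership.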