Scalable Unsupervised Segmentation via Random Fourier Feature-based Gaussian Process

📅 2025-07-14
🤖 AI Summary
To address the high computational complexity of Gaussian Process Hidden Semi-Markov Models (GP-HSMMs) in large-scale time-series segmentation, which arises from explicit computation and inversion of the $N \times N$ kernel matrix, this paper proposes RFF-GP-HSMM, an efficient unsupervised time-series segmentation method based on Random Fourier Features (RFF). By embedding RFF into the GP-HSMM framework, it replaces kernel-based Gaussian process modeling with low-dimensional linear regression, thereby eliminating the need for kernel matrix inversion. This reduces time complexity from $O(N^3)$ to $O(NM^2)$, where $M \ll N$. To the best of our knowledge, RFF-GP-HSMM is the first unsupervised segmentation model that achieves kernel-matrix-free inference while preserving the expressive power of GP-HSMMs. On the CMU Motion Capture dataset, it attains segmentation accuracy comparable to standard GP-HSMM, yet accelerates processing of a 39,200-frame sequence by up to 278×.
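
The complexity reduction can be made concrete with the standard random Fourier feature construction for a Gaussian kernel. The block below is a sketch under assumptions: the kernel choice, the length scale $\ell$, the noise variance $\sigma^2$, the cosine-only feature map, and the unit-variance prior on the weights $w$ are illustrative and are not taken from the paper, whose exact formulation may differ.

```latex
% Assumed kernel: Gaussian (RBF) with length scale \ell
k(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2\ell^2}\right)
\;\approx\; \phi(x)^\top \phi(x'),
\qquad
\phi(x) = \sqrt{\frac{2}{M}}
\begin{bmatrix}
\cos(\omega_1^\top x + b_1) \\ \vdots \\ \cos(\omega_M^\top x + b_M)
\end{bmatrix},
\quad
\omega_m \sim \mathcal{N}(0, \ell^{-2} I),\;
b_m \sim \mathrm{Uniform}[0, 2\pi].

% GP regression becomes Bayesian linear regression on the M features
y = w^\top \phi(x) + \varepsilon,
\qquad w \sim \mathcal{N}(0, I_M),
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2).

% Predictive mean with \Phi \in \mathbb{R}^{N \times M} (rows \phi(x_n)^\top)
\mu_* = \phi(x_*)^\top \left(\Phi^\top \Phi + \sigma^2 I_M\right)^{-1} \Phi^\top y.

% Cost: forming \Phi^\top \Phi is O(N M^2); solving the M x M system is O(M^3)
O(N M^2 + M^3) = O(N M^2) \quad \text{for } M \ll N.
```

Only an $M \times M$ system is solved here, so the $N \times N$ kernel matrix is never formed or inverted.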

📝 Abstract
In this paper, we propose RFF-GP-HSMM, a fast unsupervised time-series segmentation method that incorporates random Fourier features (RFF) to address the high computational cost of the Gaussian process hidden semi-Markov model (GP-HSMM). GP-HSMM models time-series data using Gaussian processes, requiring inversion of an $N \times N$ kernel matrix during training, where $N$ is the number of data points. As the scale of the data increases, matrix inversion incurs a significant computational cost. To address this, the proposed method approximates the Gaussian process with linear regression using RFF, preserving expressive power while eliminating the need for inversion of the kernel matrix. Experiments on the Carnegie Mellon University (CMU) motion-capture dataset demonstrate that the proposed method achieves segmentation performance comparable to that of conventional methods, with approximately 278 times faster segmentation on time-series data comprising 39,200 frames.
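
To illustrate the idea of approximating a Gaussian process with linear regression on random Fourier features, here is a minimal NumPy sketch. It covers only the regression component, not the HSMM segmentation itself, and it is not the authors' implementation. The function names (`rff_features`, `fit_rff_regression`, `predict`), the Gaussian kernel, the length scale, the noise variance, and the toy sine data are all assumptions made for illustration.

```python
import numpy as np


def rff_features(x, omega, b):
    """Map inputs x of shape (N, D) to random Fourier features of shape (N, M)."""
    num_features = omega.shape[1]
    return np.sqrt(2.0 / num_features) * np.cos(x @ omega + b)


def fit_rff_regression(x, y, omega, b, noise_var):
    """Posterior mean of the weights under a unit Gaussian prior (assumed)."""
    phi = rff_features(x, omega, b)                      # (N, M)
    a = phi.T @ phi + noise_var * np.eye(phi.shape[1])   # (M, M), costs O(N M^2)
    return np.linalg.solve(a, phi.T @ y)                 # M x M solve, no N x N matrix


def predict(x_new, w, omega, b):
    """Predictive mean at new inputs."""
    return rff_features(x_new, omega, b) @ w


# Toy data (hypothetical): a noisy sine wave with N points and M features, M << N.
rng = np.random.default_rng(0)
N, M, lengthscale = 2000, 100, 0.5
x = np.linspace(0.0, 10.0, N)[:, None]
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(N)

# Frequencies for a Gaussian kernel with the assumed length scale, plus random phases.
omega = rng.standard_normal((1, M)) / lengthscale
b = rng.uniform(0.0, 2.0 * np.pi, M)

w = fit_rff_regression(x, y, omega, b, noise_var=0.01)
y_hat = predict(x, w, omega, b)
rmse = float(np.sqrt(np.mean((y_hat - np.sin(x[:, 0])) ** 2)))
print(f"weights: {w.shape}, RMSE against the noise-free signal: {rmse:.4f}")
```

The only linear system solved has size $M \times M$, which is where the saving over inverting the $N \times N$ kernel matrix comes from when $M$ is much smaller than $N$.
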
Problem

Research questions and friction points this paper is trying to address.

Reduces high computational cost in GP-HSMM
Approximates Gaussian process using RFF
Enables fast unsupervised time-series segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses random Fourier features for approximation
Eliminates kernel matrix inversion requirement
Achieves fast unsupervised time-series segmentation
Authors

Issei Saito
The University of Electro-Communications, Tokyo, Japan
Masatoshi Nagano
Kyoto University, Kyoto, Japan
Tomoaki Nakamura
The University of Electro-Communications
Daichi Mochihashi
The Institute of Statistical Mathematics
natural language processing, machine learning
Koki Mimura
National Center of Neurology and Psychiatry, Tokyo, Japan