Spectral Scalpel: Amplifying Adjacent Action Discrepancy via Frequency-Selective Filtering for Skeleton-Based Action Segmentation

📅 2026-03-25
🤖 AI Summary
This work addresses weak discriminability between adjacent actions and ambiguous boundaries in skeleton-based temporal action segmentation by bringing frequency-domain analysis to the task. The proposed frequency-aware paradigm has three components: an adaptive multi-scale spectral filter that acts as a “scalpel” to modulate frequency components, a discrepancy loss between adjacent actions that increases their separability, and a frequency-aware channel mixer that aggregates spectra across channels. By suppressing frequencies shared by adjacent actions and amplifying action-specific ones, the method sharpens action boundaries. Experiments on five public datasets show state-of-the-art performance, with reduced boundary localization ambiguity and inter-class confusion.

📝 Abstract
Skeleton-based Temporal Action Segmentation (STAS) seeks to densely segment and classify diverse actions within long, untrimmed skeletal motion sequences. However, existing STAS methodologies face challenges of limited inter-class discriminability and blurred segmentation boundaries, primarily due to insufficient distinction of spatio-temporal patterns between adjacent actions. To address these limitations, we propose Spectral Scalpel, a frequency-selective filtering framework aimed at suppressing shared frequency components between adjacent distinct actions while amplifying their action-specific frequencies, thereby enhancing inter-action discrepancies and sharpening transition boundaries. Specifically, Spectral Scalpel employs adaptive multi-scale spectral filters as scalpels to edit frequency spectra, coupled with a discrepancy loss between adjacent actions serving as the surgical objective. This design amplifies representational disparities between neighboring actions, effectively mitigating boundary localization ambiguities and inter-class confusion. Furthermore, complementing long-term temporal modeling, we introduce a frequency-aware channel mixer to strengthen channel evolution by aggregating spectra across channels. This work presents a novel paradigm for STAS that extends conventional spatio-temporal modeling by incorporating frequency-domain analysis. Extensive experiments on five public datasets demonstrate that Spectral Scalpel achieves state-of-the-art performance. Code is available at https://github.com/HaoyuJi/SpecScalpel.
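
The filtering idea in the abstract can be illustrated with a small NumPy sketch. It is not taken from the authors' repository: the function names `spectral_filter` and `adjacent_discrepancy`, the fixed hand-set gain vector (standing in for the learned adaptive multi-scale filters), the cosine-based discrepancy score, and the toy signal are all assumptions made for illustration. Two adjacent "actions" share a 20 Hz component and differ in a low-frequency one; suppressing the shared band should raise the spectral discrepancy between them.

```python
import numpy as np

def spectral_filter(x, gains):
    """Scale the temporal frequency bins of a (T, C) sequence by real gains.

    `gains` has shape (T // 2 + 1,); it is a fixed stand-in for a learned filter.
    """
    spec = np.fft.rfft(x, axis=0)                      # (T // 2 + 1, C), complex
    return np.fft.irfft(spec * gains[:, None], n=x.shape[0], axis=0)

def adjacent_discrepancy(x, labels):
    """Mean (1 - cosine similarity) between magnitude spectra of adjacent segments.

    Hypothetical score for illustration; the paper's loss may be defined differently.
    """
    T = x.shape[0]
    cuts = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    bounds = np.concatenate(([0], cuts, [T]))
    # Zero-pad every segment to length T so all spectra share the same bins.
    mags = [np.abs(np.fft.rfft(x[s:e], n=T, axis=0)).ravel()
            for s, e in zip(bounds[:-1], bounds[1:])]
    scores = []
    for a, b in zip(mags[:-1], mags[1:]):
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        scores.append(1.0 - cos)
    return float(np.mean(scores)) if scores else 0.0

# Toy sequence: 2 s at 100 Hz, one channel, two adjacent actions.
fs, T = 100, 200
t = np.arange(T) / fs
labels = np.repeat([0, 1], T // 2)
specific = np.where(labels == 0,
                    np.sin(2 * np.pi * 3 * t),        # action 0: 3 Hz
                    np.sin(2 * np.pi * 8 * t))        # action 1: 8 Hz
shared = np.sin(2 * np.pi * 20 * t)                   # present in both actions
x = (specific + shared)[:, None]                      # shape (T, 1)

# Suppress the shared band (19 to 21 Hz), keep everything else unchanged.
freqs = np.fft.rfftfreq(T, d=1 / fs)
gains = np.ones(T // 2 + 1)
gains[np.abs(freqs - 20) <= 1.0] = 0.0

filtered = spectral_filter(x, gains)
before = adjacent_discrepancy(x, labels)
after = adjacent_discrepancy(filtered, labels)
print(f"discrepancy before: {before:.3f}, after: {after:.3f}")
```

In the described method the gains would be learned rather than hand-set, with the discrepancy between adjacent actions acting as the training objective; here the shared band is known in advance only because the toy signal was constructed that way.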
Problem

Research questions and friction points this paper is trying to address.

Temporal Action Segmentation
Skeleton-based Action Recognition
Inter-class Discriminability
Boundary Ambiguity
Spatio-temporal Patterns
Innovation

Methods, ideas, or system contributions that make the work stand out.

frequency-selective filtering
action segmentation
skeleton-based action recognition
spectral analysis
boundary sharpening
Haoyu Ji
Harbin Institute of Technology, Shenzhen
Bowen Chen
Shenzhen H&T Intelligent Control Co., Ltd.
Zhihao Yang
Harbin Institute of Technology, Shenzhen
Wenze Huang
Harbin Institute of Technology, Shenzhen
Yu Gao
Harbin Institute of Technology, Shenzhen
Xueting Liu
Southern University of Science and Technology
Weihong Ren
Harbin Institute of Technology, Shenzhen
image restoration, multiple object tracking, action detection
Zhiyong Wang
Harbin Institute of Technology, Shenzhen
Human-computer interaction
Honghai Liu
Portsmouth University
Human-Machine Systems, Multi-Sensory Data Fusion and Information Analytics, Bio-Mechatronics, Pattern Recognition, Intelligent Robotics