🤖 AI Summary
This study addresses the need for reliable automated pain assessment, motivated by the substantial clinical and societal burden of pain, by proposing a lightweight Transformer architecture for pain recognition from functional near-infrared spectroscopy (fNIRS) signals. The method fuses raw waveforms with power spectral density (PSD) representations through a unified tokenization mechanism, and uses a structured segmentation scheme to control local aggregation and global interaction, so that the different signal views are preserved within a shared latent space without modality-specific customization. Without increasing architectural complexity, the approach jointly models spatial, temporal, and time-frequency features. It achieves competitive performance on the AI4Pain dataset and remains compact enough for real-time inference on both GPU and CPU hardware.
📝 Abstract
Pain is a multifaceted and widespread phenomenon with substantial clinical and societal burden, making reliable automated assessment a critical objective. This paper presents a lightweight transformer architecture that fuses multiple fNIRS representations through a unified tokenization mechanism, enabling joint modeling of complementary signal views without requiring modality-specific adaptations or increasing architectural complexity. The proposed token-mixing strategy preserves spatial, temporal, and time-frequency characteristics by projecting heterogeneous inputs onto a shared latent representation, using a structured segmentation scheme to control the granularity of local aggregation and global interaction. The model is evaluated on the AI4Pain dataset using stacked raw waveform and power spectral density representations of fNIRS inputs. Experimental results demonstrate competitive pain recognition performance while remaining computationally compact, making the approach suitable for real-time inference on both GPU and CPU hardware.
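To make the idea of unified tokenization concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a multi-channel fNIRS window of shape (channels, time), estimates the PSD with a plain periodogram, cuts both views into fixed-length segments, and maps every segment to the same latent width with one shared linear projection. The function names, the sampling rate `fs`, the segment length, the latent width, and the random projection (standing in for learned weights) are all hypothetical choices made for illustration.

```python
import numpy as np


def periodogram_psd(x, fs):
    """One-sided periodogram PSD per channel; x has shape (channels, time)."""
    n = x.shape[-1]
    x = x - x.mean(axis=-1, keepdims=True)  # remove the DC offset
    spec = np.fft.rfft(x, axis=-1)
    psd = (np.abs(spec) ** 2) / (fs * n)
    # Fold negative frequencies into the one-sided estimate
    # (every bin except DC and, for even n, Nyquist).
    if n % 2 == 0:
        psd[..., 1:-1] *= 2.0
    else:
        psd[..., 1:] *= 2.0
    return psd  # shape (channels, n // 2 + 1)


def segment_tokens(view, seg_len, proj):
    """Split a (channels, length) view into non-overlapping segments
    and project each segment to the shared latent width."""
    channels, length = view.shape
    n_seg = length // seg_len  # trailing samples that do not fill a segment are dropped
    segs = view[:, : n_seg * seg_len].reshape(channels * n_seg, seg_len)
    return segs @ proj  # shape (channels * n_seg, d_model)


def tokenize(raw, fs, seg_len, d_model, seed=0):
    """Build one token sequence from the raw and PSD views (assumed design)."""
    rng = np.random.default_rng(seed)
    # A single projection shared by both views; random here, learned in practice.
    proj = rng.standard_normal((seg_len, d_model)) / np.sqrt(seg_len)
    psd = periodogram_psd(raw, fs)
    raw_tokens = segment_tokens(raw, seg_len, proj)
    psd_tokens = segment_tokens(psd, seg_len, proj)
    # Stack the two views along the token axis so a transformer can attend across them.
    return np.concatenate([raw_tokens, psd_tokens], axis=0)
```

In this sketch the segment length plays the role of the granularity control: longer segments aggregate more samples into each token and shorten the sequence, while shorter segments leave more of the interaction to attention across tokens. A Welch estimate could replace the periodogram where a smoother PSD is preferred.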