Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation

📅 2025-12-12
🤖 AI Summary
To address insufficient generalization to unseen actions in skeleton-based zero-shot action recognition (SZAR), this paper proposes Skeleton-Cache, the first training-free test-time adaptation framework. It builds a non-parametric cache that stores both global and fine-grained local skeleton descriptors, and uses large language models (LLMs) to assign class-specific importance weights that guide the fusion of descriptor-wise predictions. Its core contributions are: (1) a test-time adaptation mechanism that requires no fine-tuning and no access to training data; (2) LLM-guided, semantics-aware weighting of descriptor-wise predictions; and (3) the integration of structured skeleton representations with LLM-derived semantic priors. On the NTU RGB+D 60/120 and PKU-MMD II benchmarks, Skeleton-Cache consistently improves the performance of various SZAR backbones under both zero-shot and generalized zero-shot settings.

📝 Abstract
We introduce Skeleton-Cache, the first training-free test-time adaptation framework for skeleton-based zero-shot action recognition (SZAR), aimed at improving model generalization to unseen actions during inference. Skeleton-Cache reformulates inference as a lightweight retrieval process over a non-parametric cache that stores structured skeleton representations, combining both global and fine-grained local descriptors. To guide the fusion of descriptor-wise predictions, we leverage the semantic reasoning capabilities of large language models (LLMs) to assign class-specific importance weights. By integrating these structured descriptors with LLM-guided semantic priors, Skeleton-Cache dynamically adapts to unseen actions without any additional training or access to training data. Extensive experiments on NTU RGB+D 60/120 and PKU-MMD II demonstrate that Skeleton-Cache consistently boosts the performance of various SZAR backbones under both zero-shot and generalized zero-shot settings. The code is publicly available at https://github.com/Alchemist0754/Skeleton-Cache.
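
The retrieval-and-fusion idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `DescriptorCache` and `fuse`, the descriptor names `global` and `arms`, the exponential affinity with sharpness `beta`, the mixing factor `alpha`, and the hard-coded weights standing in for LLM output are all hypothetical. The exponential affinity is a common choice in cache-based adaptation and is not confirmed as the paper's formula.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x) + eps)


class DescriptorCache:
    """Hypothetical non-parametric cache: one key/label store per descriptor."""

    def __init__(self, num_classes, descriptor_names):
        self.num_classes = num_classes
        self.keys = {name: [] for name in descriptor_names}
        self.labels = {name: [] for name in descriptor_names}

    def add(self, descriptors, pseudo_label):
        # Store each descriptor of a test sample under its predicted class.
        for name, feat in descriptors.items():
            self.keys[name].append(l2_normalize(feat))
            self.labels[name].append(pseudo_label)

    def scores(self, descriptors, beta=5.0):
        # Per-descriptor class scores from similarity to cached entries.
        out = {}
        for name, feat in descriptors.items():
            if not self.keys[name]:
                out[name] = np.zeros(self.num_classes)
                continue
            keys = np.stack(self.keys[name])                      # (N, D)
            onehot = np.eye(self.num_classes)[self.labels[name]]  # (N, C)
            sim = keys @ l2_normalize(feat)                       # cosine, (N,)
            affinity = np.exp(-beta * (1.0 - sim))                # assumed kernel
            out[name] = affinity @ onehot                         # (C,)
        return out


def fuse(base_logits, cache_scores, class_weights, alpha=1.0):
    """Add descriptor-wise cache scores to the backbone logits, weighting each
    descriptor per class (the weights stand in for LLM-assigned importance)."""
    fused = np.asarray(base_logits, dtype=float).copy()
    for name, score in cache_scores.items():
        fused += alpha * class_weights[name] * score
    return fused


# Toy usage: 2 classes, 2 descriptors, one cached sample pseudo-labeled class 1.
cache = DescriptorCache(num_classes=2, descriptor_names=["global", "arms"])
cache.add({"global": [1.0, 0.0], "arms": [0.0, 1.0]}, pseudo_label=1)

query = {"global": [1.0, 0.0], "arms": [0.0, 1.0]}
weights = {"global": np.array([0.5, 0.5]), "arms": np.array([0.5, 0.5])}
fused = fuse([0.5, 0.4], cache.scores(query), weights)
prediction = int(np.argmax(fused))
```

In this toy run the backbone alone slightly prefers class 0, while the cached entry pulls the fused scores toward class 1. No parameters are updated at any point; only the cache contents change as test samples arrive.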
Problem

Research questions and friction points this paper is trying to address.

Skeleton-based zero-shot action recognition generalizes poorly to unseen actions
Adapting to unseen actions at inference without additional training or access to training data
Fusing descriptor-wise predictions using LLM-guided semantic priors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free test-time adaptation framework
Lightweight retrieval process with skeleton cache
LLM-guided semantic priors for descriptor fusion