AI Summary
Existing human activity recognition (HAR) methods rely heavily on large-scale labeled datasets, domain-specific training, and substantial computational resources, limiting their applicability in low-resource domains such as healthcare and rehabilitation. To address this, we propose the first training-free retrieval-augmented generation (RAG) framework for HAR, requiring neither fine-tuning nor annotated data. Our approach leverages lightweight statistical features for semantic similarity-based sample retrieval and employs an LLM-driven, context-aware activity descriptor to construct an enhanced vector repository, enabling zero-shot activity recognition and natural-language semantic labeling. Crucially, this work pioneers the integration of the RAG paradigm into HAR, yielding interpretable generalization to unseen activities. Evaluated across six heterogeneous benchmarks, our method achieves state-of-the-art performance while significantly improving practicality and deployability in low-resource settings.
Abstract
Human Activity Recognition (HAR) underpins applications in healthcare, rehabilitation, fitness tracking, and smart environments, yet existing deep learning approaches demand dataset-specific training, large labeled corpora, and significant computational resources. We introduce RAG-HAR, a training-free retrieval-augmented framework that leverages large language models (LLMs) for HAR. RAG-HAR computes lightweight statistical descriptors, retrieves semantically similar samples from a vector database, and passes this contextual evidence to an LLM, which identifies the activity. We further enhance RAG-HAR by applying prompt optimization and by introducing an LLM-based activity descriptor that generates context-enriched vector databases, so that the retrieved context is more accurate and relevant. With these mechanisms, RAG-HAR achieves state-of-the-art performance across six diverse HAR benchmarks. Most importantly, RAG-HAR attains these improvements without any model training or fine-tuning, which underlines its robustness and practical applicability. RAG-HAR also moves beyond known behaviors, enabling the recognition and meaningful labelling of multiple unseen human activities.
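The describe, retrieve, and prompt steps named in the abstract can be illustrated with a minimal sketch. This is not the RAG-HAR implementation: the function names (`describe_window`, `retrieve`, `build_prompt`), the four chosen statistics, the Euclidean distance metric, and the toy sensor values are all assumptions made for illustration. An in-memory list stands in for the vector database, and the LLM call itself is omitted, so the sketch stops at the prompt that would be sent.

```python
import math
import statistics


def describe_window(window):
    """Summarize a 1-D sensor window as (mean, std, min, max).

    The choice of statistics is an assumption; the abstract only says
    the descriptors are lightweight and statistical.
    """
    return [
        statistics.fmean(window),
        statistics.pstdev(window),
        min(window),
        max(window),
    ]


def retrieve(query_features, store, k=2):
    """Return the k stored (features, label) entries closest to the query.

    Euclidean distance is an assumed metric; a real vector database
    would replace this linear scan.
    """
    ranked = sorted(store, key=lambda entry: math.dist(query_features, entry[0]))
    return ranked[:k]


def build_prompt(query_features, neighbours):
    """Assemble a text prompt from the retrieved labelled examples (hypothetical format)."""
    lines = ["Labelled reference windows (mean, std, min, max):"]
    for features, label in neighbours:
        lines.append(f"- {label}: {[round(v, 2) for v in features]}")
    lines.append(f"Query window: {[round(v, 2) for v in query_features]}")
    lines.append("Which activity does the query window most likely show?")
    return "\n".join(lines)


# Toy reference store: made-up accelerometer-like values, not real data.
store = [
    (describe_window([0.0, 0.1, -0.1, 0.05, 0.0]), "standing"),
    (describe_window([1.0, -1.0, 1.2, -0.9, 1.1]), "walking"),
    (describe_window([3.0, -2.8, 3.1, -3.0, 2.9]), "running"),
]

query = describe_window([1.1, -0.9, 1.0, -1.1, 1.2])
neighbours = retrieve(query, store, k=2)
prompt = build_prompt(query, neighbours)
print(prompt)
```

Because no model is trained, adding a new activity in this sketch only requires appending labelled windows to `store`; the prompt text would then be passed to whichever LLM is in use.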