SelaFD: Seamless Adaptation of Vision Transformer Fine-tuning for Radar-based Human Activity Recognition

📅 2025-02-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limited performance of fully fine-tuned Vision Transformers (ViTs) for Human Activity Recognition (HAR), particularly fall detection, on radar time-Doppler maps. The difficulty stems from the non-visual nature of radar signals and the high inter-class similarity of their time-frequency representations. To overcome this, the authors propose a joint lightweight fine-tuning paradigm that adapts the weight space and the feature space simultaneously: Low-Rank Adaptation (LoRA) is applied in the weight space, while a novel serial-parallel Adapter architecture is designed for the feature space, enabling efficient and seamless ViT adaptation to the radar domain. The method significantly enhances model generalization and achieves state-of-the-art performance across multiple radar HAR benchmarks, with up to a 3.2% absolute accuracy improvement over prior methods. The source code is publicly available.
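The weight-space half of the paradigm, LoRA, freezes the pre-trained weights and learns only a low-rank update. A minimal PyTorch sketch of this idea, not the paper's exact implementation (layer names, rank, and scaling are illustrative assumptions), could look like:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with only A and B trained."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pre-trained ViT weights frozen
        # A is small random, B is zero, so the update starts as a no-op
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Because B is initialized to zero, wrapping a layer this way leaves the pre-trained model's outputs unchanged at the start of fine-tuning; only the low-rank factors receive gradients.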

📝 Abstract
Human Activity Recognition (HAR) such as fall detection has become increasingly critical due to the aging population, necessitating effective monitoring systems to prevent serious injuries and fatalities associated with falls. This study focuses on fine-tuning the Vision Transformer (ViT) model specifically for HAR using radar-based Time-Doppler signatures. Unlike traditional image datasets, these signals present unique challenges due to their non-visual nature and the high degree of similarity among various activities. Directly fine-tuning the ViT with all parameters proves suboptimal for this application. To address this challenge, we propose a novel approach that employs Low-Rank Adaptation (LoRA) fine-tuning in the weight space to facilitate knowledge transfer from pre-trained ViT models. Additionally, to extract fine-grained features, we enhance feature representation through the integration of a serial-parallel adapter in the feature space. Our innovative joint fine-tuning method, tailored for radar-based Time-Doppler signatures, significantly improves HAR accuracy, surpassing existing state-of-the-art methodologies in this domain. Our code is released at https://github.com/wangyijunlyy/SelaFD.
Problem

Research questions and friction points this paper is trying to address.

Fine-tuning Vision Transformer for radar-based HAR
Overcoming non-visual signal challenges in HAR
Enhancing HAR accuracy with LoRA and adapters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-Rank Adaptation fine-tuning
Serial-parallel adapter integration
Vision Transformer for radar signals
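The feature-space half of the method combines adapters serially and in parallel around a frozen ViT sub-block. The paper's exact wiring is not reproduced here; the sketch below is a generic bottleneck-adapter arrangement under assumed names (`Adapter`, `SerialParallelBlock`) and dimensions:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Standard bottleneck adapter: down-project, nonlinearity, up-project."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return self.up(self.act(self.down(x)))

class SerialParallelBlock(nn.Module):
    """Hypothetical sketch: a frozen transformer sub-block with a parallel
    adapter branch, followed by a serial adapter with a residual connection."""
    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block          # frozen pre-trained sub-block
        self.parallel = Adapter(dim)
        self.serial = Adapter(dim)

    def forward(self, x):
        h = self.block(x) + self.parallel(x)  # parallel adapter branch
        return h + self.serial(h)             # serial adapter, residual
```

Only the adapter parameters would be trained, keeping the fine-tuning lightweight relative to updating all ViT weights.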
Yijun Wang
Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, 210096, China
Yong Wang
Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, 210096, China
Chendong Xu
Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, 210096, China
Shuai Yao
Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing, 210096, China
Qisong Wu
Postdoctoral Associate, Electrical and Computer Engineering, Duke University
Nonparametric Bayesian analysis; synthetic aperture radar imaging