FMRFT: Fusion Mamba and DETR for Query Time Sequence Intersection Fish Tracking

📅 2024-09-02
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
To address the loss of accuracy and stability in underwater multi-object tracking of sturgeon, caused by reflections, high appearance similarity between individual fish, rapid motion, and severe mutual occlusion, this paper builds a complex multi-scenario sturgeon tracking dataset for factory farming and proposes FMRFT, an end-to-end real-time tracking framework. Its core component is the Query Time Sequence Intersection (QTSI) module, which combines Mamba-based temporal modeling with the RT-DETR query mechanism to handle occluded targets and reduce redundant tracking frames. A Mamba-in-Mamba (MIM) architecture with low GPU memory consumption provides multi-frame temporal memory and feature extraction. Trained and tested on this dataset, the method achieves an IDF1 of 90.3% and a MOTA of 94.3%. These results support fish tracking as a basis for early detection of abnormal behavior, such as that caused by disease or hunger, in aquaculture environments.

📝 Abstract
Early detection of abnormal fish behavior caused by disease or hunger can be achieved through fish tracking with deep learning techniques, which holds significant value for industrial aquaculture. However, underwater reflections and characteristics of the fish themselves, such as high visual similarity, rapid swimming in response to stimuli, and mutual occlusion, make multi-target fish tracking challenging. To address these challenges, this paper establishes a complex multi-scenario sturgeon tracking dataset and introduces the FMRFT model, a real-time end-to-end fish tracking solution. The model incorporates the Mamba-in-Mamba (MIM) architecture, which has low GPU memory consumption and provides multi-frame temporal memory and feature extraction, thereby addressing the challenge of tracking multiple fish across frames. In addition, the Query Time Sequence Intersection (QTSI) module draws on the feature interaction and prior-frame processing capabilities of RT-DETR to handle occluded objects and reduce redundant tracking frames. This combination significantly enhances the accuracy and stability of fish tracking. Trained and tested on the dataset, the model achieves an IDF1 of 90.3% and a MOTA of 94.3%. Experimental results show that the proposed FMRFT model effectively addresses the challenges of high similarity and mutual occlusion in fish populations, enabling accurate tracking in factory farming environments.
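For readers unfamiliar with the two reported metrics, the sketch below shows how MOTA and IDF1 are conventionally computed from aggregate counts in the multi-object tracking literature. It is not the paper's evaluation code; the function names, parameter names, and the counts in the usage lines are hypothetical and chosen only for illustration.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy.

    MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the total number of
    ground-truth objects summed over all frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identification F1 score.

    IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), computed after the
    identity-level matching between predicted and ground-truth tracks.
    """
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denominator


# Illustrative counts only (not taken from the paper).
example_mota = mota(false_negatives=10, false_positives=5,
                    id_switches=5, num_gt=100)
example_idf1 = idf1(idtp=90, idfp=10, idfn=10)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

MOTA penalizes missed detections, false detections, and identity switches together, while IDF1 focuses on how consistently each fish keeps the same identity over time, which is why the two numbers are usually reported side by side.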
Problem

Research questions and friction points this paper is trying to address.

Underwater Environment
Multi-fish Tracking
Aquaculture Management
Innovation

Methods, ideas, or system contributions that make the work stand out.

FMRFT
multi-fish tracking
occlusion handling
Mingyuan Yao
National Innovation Center for Digital Fishery, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; College of Information and Electrical Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China
Yukang Huo
National Innovation Center for Digital Fishery, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; College of Information and Electrical Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China
Qingbin Tian
National Innovation Center for Digital Fishery, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; College of Information and Electrical Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China
Jiayin Zhao
Tsinghua University
Computational Imaging
Xiao Liu
National Innovation Center for Digital Fishery, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; College of Information and Electrical Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China
Rui-Feng Wang
College of Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China
Haihua Wang
National Innovation Center for Digital Fishery, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; Key Laboratory of Smart Farming Technologies for Aquatic Animal and Livestock, Ministry of Agriculture and Rural Affairs, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China; College of Information and Electrical Engineering, China Agricultural University, No. 17, Qinghua East Road, Haidian District, Beijing, 100083, China