FUN: A Focal U-Net Combining Reconstruction and Object Detection for Snapshot Spectral Imaging

📅 2026-04-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the time-consuming hyperspectral image reconstruction step in snapshot spectral imaging, which hinders real-time object detection. The authors propose an end-to-end Focal U-Net (FUN) framework that jointly performs reconstruction and detection through a shared U-shaped backbone trained with multi-task learning. FUN replaces self-attention with focal modulation, yielding a self-attention-free architecture with lower computational complexity. The study also contributes a new hyperspectral object detection dataset with 8,712 annotated objects across 363 HSIs. Experiments show that FUN achieves state-of-the-art performance on both reconstruction and detection while using 40% fewer parameters and 30% less computation than recent alternatives, making it promising for future real-time deployment on edge devices.
📝 Abstract
Conventional push-broom hyperspectral imaging suffers from slow acquisition, which precludes real-time object detection. In contrast, snapshot spectral imaging captures hyperspectral images (HSIs) instantaneously, making real-time object detection feasible, yet its potential is often compromised by time-consuming post-capture reconstruction. To address this issue, we propose the Focal U-shaped Network (FUN), a novel end-to-end framework that jointly performs HSI reconstruction and object detection via multi-task learning. FUN employs a shared U-shaped backbone in which reconstruction provides the underlying spectral information while detection guides the learning of semantic-aware priors, so that the two tasks benefit each other. Crucially, we introduce focal modulation, an efficient alternative to self-attention that modulates spatial and spectral features while reducing the quadratic computational complexity associated with self-attention, enabling a self-attention-free architecture for joint reconstruction and detection. Furthermore, we contribute a new HSI object detection dataset with 8,712 annotated objects across 363 HSIs to facilitate evaluation of the proposed method. Experiments demonstrate that FUN achieves state-of-the-art performance on both tasks while using 40% fewer parameters and 30% less computation than recent alternatives, making it promising for future real-time edge deployment. The code and datasets are available at https://github.com/ShawnDong98/FUN.
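To make the complexity argument concrete, the following is a rough back-of-the-envelope sketch, not the FUN implementation. It compares how the token-mixing cost of global self-attention and of a focal-modulation-style block scale with image size. The function names, the kernel sizes `(3, 5, 7)`, and the channel width `dim = 32` are hypothetical choices for illustration. The focal modulation cost model (one depthwise convolution per focal level, a gate per level, and one elementwise modulation) is an assumption based on how focal modulation is commonly described, and linear projections are ignored for both mechanisms.

```python
def self_attention_macs(num_tokens: int, dim: int) -> int:
    """Rough multiply-accumulate (MAC) count for global self-attention mixing.

    Counts only Q @ K^T and attn @ V, each costing
    num_tokens * num_tokens * dim MACs. Linear projections are ignored.
    """
    return 2 * num_tokens * num_tokens * dim


def focal_modulation_macs(num_tokens: int, dim: int, kernel_sizes=(3, 5, 7)) -> int:
    """Rough MAC count for a focal-modulation-style block (assumed cost model).

    Assumes one depthwise convolution per focal level (k * k MACs per token
    per channel), one elementwise gate multiply per level, and one final
    elementwise modulation of the query. Linear projections are ignored.
    """
    conv = sum(k * k for k in kernel_sizes) * num_tokens * dim
    gating = len(kernel_sizes) * num_tokens * dim
    modulation = num_tokens * dim
    return conv + gating + modulation


if __name__ == "__main__":
    dim = 32  # hypothetical channel width, not taken from the paper
    for side in (64, 128, 256):
        n = side * side  # one token per spatial position
        sa = self_attention_macs(n, dim)
        fm = focal_modulation_macs(n, dim)
        print(f"{side}x{side}: self-attention {sa:.3e} MACs, "
              f"focal modulation {fm:.3e} MACs")
```

Under these assumptions, doubling the number of tokens quadruples the self-attention count but only doubles the focal modulation count. Windowed or otherwise restricted attention variants would narrow this gap, so the numbers indicate the scaling trend only and do not reproduce the 30% computation saving reported in the abstract.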
Problem

Research questions and friction points this paper is trying to address.

snapshot spectral imaging
hyperspectral image reconstruction
real-time object detection
computational complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Focal U-Net
snapshot spectral imaging
multi-task learning
focal modulation
hyperspectral object detection
Dahua Gao
School of Artificial Intelligence, Xidian University, Xi’an 710126, China
Yubo Dong
Changzhi Medical College, the Engineering Research Centre for Intelligent Data Assisted Diagnosis and Treatment in Shanxi Province, and Uniwave Artificial Intelligence Technology Co., Ltd., Changzhi 046000, China
Anqi Li
School of Artificial Intelligence, Xidian University, Xi’an 710126, China
Zhenyuan Lin
School of Artificial Intelligence, Xidian University, Xi’an 710126, China
Ang Gao
School of Artificial Intelligence, Xidian University, Xi’an 710126, China
Danhua Liu
School of Artificial Intelligence, Xidian University, Xi’an 710126, China
Guangming Shi
School of Electronic Engineering, Xidian University, China; Peng Cheng Laboratory
compressed sensing, acquisition and processing of remote sensing images, multimedia image communication, medical imaging