MixerCA: An Efficient and Accurate Model for High-Performance Hyperspectral Image Classification

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of efficiently modeling complex spatial and spectral features in hyperspectral image classification. The authors propose MixerCA, a lightweight model that, for the first time, unifies depthwise separable convolution, a channel-token mixing mechanism, and coordinate attention within a single architecture. This design decouples spatial and channel interactions while processing hyperspectral image patches directly at their original resolution. Evaluated on four standard hyperspectral datasets, MixerCA outperforms mainstream models, including 2D/3D-CNNs, Tri-CNN, HybridSN, Vision Transformer (ViT), and Swin Transformer, achieving high classification accuracy with substantially lower computational overhead.
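To give a sense of why depthwise separable convolution is associated with lower computational overhead, the sketch below compares weight counts for a standard convolution and a depthwise-plus-pointwise pair. The layer sizes and function names are hypothetical, chosen for illustration only; they are not taken from the paper or its repository.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper
c_in, c_out, k = 64, 64, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(std / sep, 2))  # 36864 4672 7.89
```

For these assumed sizes the separable form needs roughly one eighth of the weights, and the gap widens as the kernel size or the output channel count grows.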

📝 Abstract

Over the past decade, hyperspectral image (HSI) classification has drawn considerable interest because HSIs capture detailed, continuous spectral information that effectively distinguishes terrestrial objects. The strong performance of recent deep learning techniques in tasks such as image classification and semantic segmentation has led to their growing use in HSI classification, where they capture complex spatial and spectral features more effectively than traditional methods. This paper presents MixerCA, a novel lightweight model for HSI classification that leverages depthwise convolution and a self-attention mechanism. MixerCA integrates depthwise convolutions, token and channel mixing, and coordinate attention into a unified structure to decouple spatial and channel interactions, maintain consistent resolution throughout the network, and directly process HSI patches. Extensive experiments on four hyperspectral benchmark datasets reveal MixerCA's clear advantages over several competing algorithms, including 2D-CNN, 3D-CNN, Tri-CNN, HybridSN, ViT, and Swin Transformer. The source code is publicly available at https://github.com/mqalkhatib/MixerCA.
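As a rough illustration of the coordinate attention component named in the abstract, here is a minimal NumPy sketch of the general technique: pool along height and width separately, pass both through a shared channel reduction, then reweight the input with two direction-aware gates. This is an assumption-laden simplification, not MixerCA's actual layer: it omits batch normalization, uses ReLU as the nonlinearity, and models the 1 x 1 convolutions as plain matrices with random weights. All names and shapes are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def coordinate_attention(x, w_reduce, w_h, w_w):
    """Simplified coordinate attention on one patch.

    x:        (C, H, W) feature map
    w_reduce: (C_mid, C) shared 1 x 1 reduction weights
    w_h, w_w: (C, C_mid) 1 x 1 weights for the height and width gates
    """
    c, h, w = x.shape
    pooled_h = x.mean(axis=2)                          # (C, H): average over width
    pooled_w = x.mean(axis=1)                          # (C, W): average over height
    y = np.concatenate([pooled_h, pooled_w], axis=1)   # (C, H + W)
    y = np.maximum(w_reduce @ y, 0.0)                  # shared reduction + ReLU
    y_h, y_w = y[:, :h], y[:, h:]                      # split back per direction
    a_h = sigmoid(w_h @ y_h)                           # (C, H) gate along height
    a_w = sigmoid(w_w @ y_w)                           # (C, W) gate along width
    # Broadcast both gates over the patch; output keeps the input resolution
    return x * a_h[:, :, None] * a_w[:, None, :]


# Hypothetical patch: 8 channels, 5 x 5 spatial window, random weights
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
w_reduce = rng.standard_normal((4, 8))
w_h = rng.standard_normal((8, 4))
w_w = rng.standard_normal((8, 4))
out = coordinate_attention(x, w_reduce, w_h, w_w)
print(out.shape)  # (8, 5, 5)
```

Because both gates lie in (0, 1) and are applied multiplicatively, the output has the same shape as the input, which is in line with the abstract's point about maintaining consistent resolution.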

Problem

Research questions and friction points this paper is trying to address.

hyperspectral image classification

deep learning

spatial-spectral features

efficient model

high-performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

depthwise convolution

coordinate attention

token mixing