TCLeaf-Net: a transformer-convolution framework with global-local attention for robust in-field lesion-level plant leaf disease detection

📅 2025-12-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
To address challenges in field-based leaf lesion detection, including complex background interference, domain shift, and scarcity of lesion-level annotations, this paper introduces Daylily-Leaf, a paired lesion-level dataset of 1,746 RGB images and 7,839 lesions captured under both ideal and in-field conditions. It also proposes TCLeaf-Net, a hybrid detector that combines Transformer and CNN components. Three modules drive the design: (1) the transformer-convolution module (TCM), which suppresses non-leaf background; (2) the raw-scale feature recalling and sampling (RSFRS) block, which preserves fine spatial detail during downsampling; and (3) the deformable alignment block with FPN (DFPN), which aligns and fuses multi-scale features. On the in-field split of Daylily-Leaf, TCLeaf-Net reaches an mAP@50 of 78.2%, 5.4 percentage points above the baseline model, while reducing computation by 7.5 GFLOPs and GPU memory usage by 8.7%. It also outperforms recent YOLO and RT-DETR series models in precision and recall, and performs strongly on the PlantDoc, Tomato-Leaf, and Rice-Leaf datasets, supporting its generalizability to other plant disease detection scenarios.
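The headline numbers are stated as a gain in percentage points, so the baseline score is implied rather than given. The short calculation below recovers it from the two reported figures and also expresses the gain relative to the baseline. The variable names are illustrative; only the two input values come from the summary.

```python
# Figures quoted in the summary above (in-field split of Daylily-Leaf).
MAP50_TCLEAF = 78.2   # mAP@50 of TCLeaf-Net, in percent
GAIN_POINTS = 5.4     # improvement over the baseline, in percentage points

# A percentage-point gain is a plain difference, so the implied
# baseline score is the reported score minus the gain.
map50_baseline = round(MAP50_TCLEAF - GAIN_POINTS, 1)

# The same gain expressed relative to the baseline (a ratio, not a difference).
relative_gain_pct = 100.0 * GAIN_POINTS / map50_baseline

print(f"implied baseline mAP@50: {map50_baseline:.1f}%")
print(f"relative improvement:    {relative_gain_pct:.1f}%")
```

The implied baseline is 72.8% mAP@50, and the relative improvement is a little over 7%, which is a larger-looking number than 5.4 only because it is measured against the baseline rather than against 100%.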

📝 Abstract
Timely and accurate detection of foliar diseases is vital for safeguarding crop growth and reducing yield losses. Yet, in real-field conditions, cluttered backgrounds, domain shifts, and limited lesion-level datasets hinder robust modeling. To address these challenges, we release Daylily-Leaf, a paired lesion-level dataset comprising 1,746 RGB images and 7,839 lesions captured under both ideal and in-field conditions, and propose TCLeaf-Net, a transformer-convolution hybrid detector optimized for real-field use. TCLeaf-Net is designed to tackle three major challenges. To mitigate interference from complex backgrounds, the transformer-convolution module (TCM) couples global context with locality-preserving convolution to suppress non-leaf regions. To reduce information loss during downsampling, the raw-scale feature recalling and sampling (RSFRS) block combines bilinear resampling and convolution to preserve fine spatial detail. To handle variations in lesion scale and feature shifts, the deformable alignment block with FPN (DFPN) employs offset-based alignment and multi-receptive-field perception to strengthen multi-scale fusion. Experimental results show that on the in-field split of the Daylily-Leaf dataset, TCLeaf-Net improves mAP@50 by 5.4 percentage points over the baseline model, reaching 78.2%, while reducing computation by 7.5 GFLOPs and GPU memory usage by 8.7%. Moreover, the model outperforms recent YOLO and RT-DETR series in both precision and recall, and demonstrates strong performance on the PlantDoc, Tomato-Leaf, and Rice-Leaf datasets, validating its robustness and generalizability to other plant disease detection scenarios.
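The mAP@50 metric quoted above counts a predicted box as correct when its intersection over union (IoU) with a ground-truth lesion box is at least 0.5. The sketch below shows one common way to compute average precision for a single class under that threshold. It is a simplified illustration and not the evaluation code used in the paper: the function names, the greedy matching to the best unmatched ground-truth box, and the all-point interpolation are assumptions, and evaluation toolkits differ in these details. mAP@50 would then be the mean of this value over disease classes.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Tuple[float, Box]],
                      gts: List[Box],
                      iou_thr: float = 0.5) -> float:
    """AP for one class; preds are (confidence, box) pairs."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits = []  # True for a true positive, False for a false positive
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue  # each ground-truth box can be matched once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True
            hits.append(True)
        else:
            hits.append(False)

    # Precision and recall after each prediction in ranked order.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / len(gts))

    # Make precision non-increasing from right to left (the usual envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Toy example: two lesions, three predictions (one is a false positive).
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
ap50 = average_precision(preds, gts)
print(f"AP@50 on the toy example: {ap50:.3f}")
```

In the toy example the false positive ranked second lowers precision at full recall, so AP@50 falls below 1 even though both lesions are eventually found.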
Problem

Research questions and friction points this paper is trying to address.

Detecting lesion-level leaf disease in real-field conditions, where cluttered backgrounds interfere with the leaf signal.
Coping with domain shifts and the scarcity of lesion-level datasets, both of which hinder robust modeling.
Improving multi-scale lesion detection while reducing computational and memory costs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

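Transformer-convolution module (TCM) that couples global context with locality-preserving convolution to suppress non-leaf regions
Raw-scale feature recalling and sampling (RSFRS) block that combines bilinear resampling and convolution to preserve spatial detail
Deformable alignment block with FPN (DFPN) that uses offset-based alignment and multi-receptive-field perception for multi-scale fusion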
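Bilinear resampling and offset-based alignment both rest on the same primitive: reading a feature map at fractional coordinates by blending the four nearest cells. The sketch below shows that primitive in plain Python on a single-channel grid. It is an illustration of the general technique only, not the paper's RSFRS or DFPN implementation: the function names are hypothetical, the offsets are supplied by hand rather than predicted by a network, and the convolution layers are omitted.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def bilinear_sample(feat: Grid, y: float, x: float) -> float:
    """Read feat at fractional (y, x); coordinates are clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Blend along x on the two neighbouring rows, then blend along y.
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def resize_bilinear(feat: Grid, out_h: int, out_w: int) -> Grid:
    """Resample a grid to (out_h, out_w), keeping the corner cells aligned."""
    h, w = len(feat), len(feat[0])
    step_y = (h - 1) / (out_h - 1) if out_h > 1 else 0.0
    step_x = (w - 1) / (out_w - 1) if out_w > 1 else 0.0
    return [[bilinear_sample(feat, i * step_y, j * step_x)
             for j in range(out_w)]
            for i in range(out_h)]


def sample_with_offsets(feat: Grid,
                        offsets: List[List[Tuple[float, float]]]) -> Grid:
    """Offset-based alignment: cell (i, j) reads from (i + dy, j + dx)."""
    return [[bilinear_sample(feat, i + offsets[i][j][0], j + offsets[i][j][1])
             for j in range(len(feat[0]))]
            for i in range(len(feat))]


feat = [[0.0, 1.0],
        [2.0, 3.0]]

# Upsample the 2x2 map to 3x3; new cells are blends of their neighbours.
upsampled = resize_bilinear(feat, 3, 3)

# Shift every read half a cell to the right (hand-picked offsets).
shift_right = [[(0.0, 0.5), (0.0, 0.5)],
               [(0.0, 0.5), (0.0, 0.5)]]
aligned = sample_with_offsets(feat, shift_right)

print(upsampled)
print(aligned)
```

Because the sampling is a weighted average of neighbouring cells, it is differentiable with respect to the offsets, which is what allows a detector to learn where to read from when fusing feature maps of different scales.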