SciLT: Long-Tailed Classification in Scientific Image Domains

📅 2026-04-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses long-tailed classification in scientific images, where substantial domain shift from natural images limits the effectiveness of standard foundation-model fine-tuning. The study presents the first systematic analysis of the limitations of foundation models in this setting and introduces SciLT, a parameter-efficient fine-tuning framework that adaptively fuses multi-level features and uses dual supervision to jointly optimize final-layer and penultimate-layer features, the latter being particularly important for tail-class recognition. By explicitly balancing performance across head and tail classes, SciLT consistently outperforms existing methods on three scientific image benchmarks, establishing a strong and practical baseline for long-tailed recognition in scientific domains.
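To make "balancing performance across head and tail classes" concrete, the sketch below reports accuracy separately for the two groups. It is an illustration only, not the paper's evaluation protocol: the function name `group_accuracy` and the `tail_threshold` of 20 training samples are assumptions chosen for the example.

```python
from collections import Counter


def group_accuracy(train_labels, test_labels, predictions, tail_threshold=20):
    """Return accuracy for head and tail classes separately.

    A class counts as "tail" when it has fewer than `tail_threshold`
    training samples (an assumed cut-off, not taken from the paper).
    """
    counts = Counter(train_labels)
    hits = {"head": 0, "tail": 0}
    totals = {"head": 0, "tail": 0}
    for label, pred in zip(test_labels, predictions):
        group = "tail" if counts[label] < tail_threshold else "head"
        totals[group] += 1
        hits[group] += int(label == pred)
    # None marks a group with no test samples.
    return {g: (hits[g] / totals[g] if totals[g] else None) for g in totals}


# Toy data: class 0 is a head class (30 samples), class 1 is a tail class (5).
train = [0] * 30 + [1] * 5
result = group_accuracy(train, [0, 0, 1, 1], [0, 0, 1, 0])
```

Reporting the two numbers side by side shows whether a gain in overall accuracy comes at the expense of the rare classes.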
📝 Abstract
Long-tailed recognition has benefited from foundation models and fine-tuning paradigms, yet existing studies and benchmarks are mainly confined to natural image domains, where pre-training and fine-tuning data share similar distributions. In contrast, scientific images exhibit distinct visual characteristics and supervision signals, raising questions about the effectiveness of fine-tuning foundation models in such settings. In this work, we investigate scientific long-tailed recognition under a purely visual and parameter-efficient fine-tuning (PEFT) paradigm. Experiments on three scientific benchmarks show that fine-tuning foundation models yields limited gains, and reveal that penultimate-layer features play an important role, particularly for tail classes. Motivated by these findings, we propose SciLT, a framework that exploits multi-level representations through adaptive feature fusion and dual-supervision learning. By jointly leveraging penultimate- and final-layer features, SciLT achieves balanced performance across head and tail classes. Extensive experiments demonstrate that SciLT consistently outperforms existing methods, establishing a strong and practical baseline for scientific long-tailed recognition and providing valuable guidance for adapting foundation models to scientific data with substantial domain shifts.
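The abstract does not give the exact formulation, so the following is only a minimal sketch of how adaptive feature fusion and dual supervision could be combined. Everything in it is an assumption made for illustration: the scalar sigmoid gate, the weighted sum of two cross-entropy terms, the `aux_weight` of 0.5, and all function names are hypothetical and are not taken from SciLT.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw logits and the true class index."""
    return -math.log(softmax(logits)[label])


def fuse(penultimate, final, gate_logit):
    """Blend two same-sized feature vectors with a scalar gate.

    The gate (sigmoid of a learnable scalar) is an assumed, simplified form
    of "adaptive" fusion; the paper may use a different mechanism.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * p + (1.0 - gate) * f for p, f in zip(penultimate, final)]


def dual_supervision_loss(penultimate_logits, final_logits, label, aux_weight=0.5):
    """Supervise both branches: final-layer loss plus weighted penultimate loss."""
    main = cross_entropy(final_logits, label)
    aux = cross_entropy(penultimate_logits, label)
    return main + aux_weight * aux


# A gate logit of 0 gives equal weight to both feature levels.
fused = fuse([1.0, 3.0], [3.0, 1.0], gate_logit=0.0)
loss = dual_supervision_loss([0.0, 0.0], [0.0, 0.0], label=1)
```

In a real training loop the gate and both classifier heads would be learned with an automatic-differentiation framework; the pure-Python version here only shows how the two supervision signals and the fusion step fit together.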
Problem

Research questions and friction points this paper is trying to address.

long-tailed recognition
scientific images
foundation models
domain shift
fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

long-tailed recognition
scientific images
parameter-efficient fine-tuning
multi-level feature fusion
dual-supervision learning