RefOnce: Distilling References into a Prototype Memory for Referring Camouflaged Object Detection

📅 2025-11-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing Referring Camouflaged Object Detection (Ref-COD) methods require reference images at test time, which complicates deployment, adds inference latency, and increases the data-collection burden. To address this, we propose a Ref-COD framework that needs no test-time references: instead of feeding references through a second branch at inference, it distills them into a class-prototype memory during training, maintaining one prototype per category via exponential moving average (EMA) updates. At inference, mixture weights predicted from the query combine the prototypes into a guidance vector that stands in for the reference. A bidirectional attention alignment module then adapts both the query features and the class representation, bridging the gap between reference statistics and camouflaged query features. Evaluated on the R2C7K benchmark, the method achieves competitive or superior performance compared with recent state-of-the-art methods while avoiding the latency and data-collection costs of test-time references.

📝 Abstract

Referring Camouflaged Object Detection (Ref-COD) segments specified camouflaged objects in a scene by leveraging a small set of referring images. Though effective, current systems adopt a dual-branch design that requires reference images at test time, which limits deployability and adds latency and data-collection burden. We introduce a Ref-COD framework that distills references into a class-prototype memory during training and synthesizes a reference vector at inference via a query-conditioned mixture of prototypes. Concretely, we maintain an EMA-updated prototype per category and predict mixture weights from the query to produce a guidance vector without any test-time references. To bridge the representation gap between reference statistics and camouflaged query features, we propose a bidirectional attention alignment module that adapts both the query features and the class representation. Thus, our approach yields a simple, efficient path to Ref-COD without mandatory references. We evaluate the proposed method on the large-scale R2C7K benchmark. Extensive experiments demonstrate competitive or superior performance of the proposed method compared with recent state-of-the-art methods. Code is available at https://github.com/yuhuan-wu/RefOnce.
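
The prototype memory described in the abstract can be illustrated with a minimal sketch. It is not taken from the RefOnce code: the class name `PrototypeMemory`, the `momentum` value, and the method names are hypothetical, and the mixture weights here are a softmax over query-prototype dot products, a simple stand-in for whatever learned predictor the paper uses.

```python
import math


class PrototypeMemory:
    """Hypothetical per-category prototype store updated by EMA."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.momentum = momentum
        self.prototypes = [[0.0] * dim for _ in range(num_classes)]
        self.initialized = [False] * num_classes

    def update(self, class_id, ref_feature):
        """Training step: blend one reference feature into its category prototype."""
        if not self.initialized[class_id]:
            # The first reference seen for a category seeds its prototype.
            self.prototypes[class_id] = list(ref_feature)
            self.initialized[class_id] = True
            return
        m = self.momentum
        old = self.prototypes[class_id]
        self.prototypes[class_id] = [
            m * p + (1.0 - m) * r for p, r in zip(old, ref_feature)
        ]

    def synthesize(self, query_feature):
        """Inference step: mix prototypes with query-conditioned weights.

        Assumption: weights are a softmax over dot-product similarity,
        standing in for a learned weight predictor.
        """
        scores = [
            sum(q * p for q, p in zip(query_feature, proto))
            for proto in self.prototypes
        ]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        guidance = [
            sum(w * proto[d] for w, proto in zip(weights, self.prototypes))
            for d in range(len(query_feature))
        ]
        return guidance, weights
```

In this sketch, `update` would be called with reference features during training only, while `synthesize` needs nothing but the query feature at test time, which is the property the abstract emphasizes.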

Problem

Research questions and friction points this paper is trying to address.

Eliminating test-time reference images for camouflaged object detection

Distilling category prototypes into memory during the training phase

Generating reference vectors through query-conditioned prototype mixtures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distilling references into class-prototype memory during training

Synthesizing reference vector via query-conditioned prototype mixture

Using bidirectional attention to align query and class representations
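
The idea of bidirectional attention between query features and class representations can be sketched as two cross-attention passes, one in each direction. This is an illustrative assumption rather than the paper's module: the function names `attend` and `bidirectional_align` are hypothetical, there are no learned projections, and each pass simply adds the attended context back as a residual.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(targets, sources):
    """Each target token reads from all source tokens (scaled dot-product),
    and the attended context is added back as a residual."""
    scale = math.sqrt(len(sources[0]))
    updated = []
    for t in targets:
        weights = softmax(
            [sum(a * b for a, b in zip(t, s)) / scale for s in sources]
        )
        context = [
            sum(w * s[d] for w, s in zip(weights, sources))
            for d in range(len(t))
        ]
        updated.append([a + c for a, c in zip(t, context)])
    return updated


def bidirectional_align(query_tokens, class_tokens):
    """Adapt both sides: the query reads from the class representation
    and the class representation reads from the query."""
    aligned_query = attend(query_tokens, class_tokens)
    aligned_class = attend(class_tokens, query_tokens)
    return aligned_query, aligned_class
```

Both passes read from the original inputs, so neither direction depends on the other's output; a real module might instead chain them or add learned query, key, and value projections.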

🔎 Similar Papers

Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage