Learning through Creation: A Hash-Free Framework for On-the-Fly Category Discovery

📅 2026-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two limitations of existing on-the-fly category discovery methods: a mismatch between the offline training objective and the novel-class discovery task performed online at inference, and limited representational capacity caused by hashing or feature compression. The authors propose a hash-free, full-feature framework that explicitly trains for novel-class discovery during the offline stage, which they state no prior method does. The approach combines a lightweight pseudo-unknown sample generator that co-evolves with the model and is driven by kernel-energy minimization and entropy maximization (MKEE), a dual max-margin objective, and an adaptive thresholding mechanism. Evaluated on seven benchmarks, the method consistently outperforms prior work, with improvements of 1.5% to 13.1% in all-class accuracy.

📝 Abstract
On-the-Fly Category Discovery (OCD) aims to recognize known classes while simultaneously discovering emerging novel categories during inference, using supervision only from known classes during offline training. Existing approaches rely either on fixed label supervision or on diffusion-based augmentations to enhance the backbone, yet none of them explicitly train the model to perform the discovery task required at test time. It is fundamentally unreasonable to expect a model optimized on limited labeled data to carry out a qualitatively different discovery objective during inference. This mismatch creates a clear optimization misalignment between the offline learning stage and the online discovery stage. In addition, prior methods often depend on hash-based encodings or severe feature compression, which further limits representational capacity. To address these issues, we propose Learning through Creation (LTC), a fully feature-based and hash-free framework that injects novel-category awareness directly into offline learning. At its core is a lightweight, online pseudo-unknown generator driven by kernel-energy minimization and entropy maximization (MKEE). Unlike previous methods that generate synthetic samples once before training, our generator evolves jointly with the model dynamics and synthesizes pseudo-novel instances on the fly at negligible cost. These samples are incorporated through a dual max-margin objective with adaptive thresholding, strengthening the model's ability to delineate and detect unknown regions through explicit creation. Extensive experiments across seven benchmarks show that LTC consistently outperforms prior work, achieving improvements ranging from 1.5 percent to 13.1 percent in all-class accuracy. The code is available at https://github.com/brandinzhang/LTC
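The abstract names two quantities behind the pseudo-unknown generator, kernel energy and prediction entropy, but does not give their exact form. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes an RBF kernel, a simple weighted difference as the combined score, and hypothetical names (`kernel_energy`, `prediction_entropy`, `mkee_score`, `bandwidth`, `weight`). Under these assumptions, a good pseudo-unknown lies far from the known features (low kernel energy) and receives a near-uniform prediction over the known classes (high entropy).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def prediction_entropy(logits):
    # Shannon entropy of the predicted distribution over known classes.
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def kernel_energy(candidates, known, bandwidth=1.0):
    # Assumed form: mean RBF similarity between each candidate and all known features.
    d2 = ((candidates[:, None, :] - known[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(axis=1)

def mkee_score(candidates, logits, known, bandwidth=1.0, weight=1.0):
    # Assumed combination: lower is better (low kernel energy, high entropy).
    return kernel_energy(candidates, known, bandwidth) - weight * prediction_entropy(logits)

# Toy data: known features clustered at the origin, two candidate samples.
rng = np.random.default_rng(0)
known = rng.normal(0.0, 0.1, size=(50, 2))
candidates = np.array([[0.0, 0.0], [3.0, 3.0]])
# Hypothetical classifier logits: confident for candidate 0, uniform for candidate 1.
logits = np.array([[8.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

scores = mkee_score(candidates, logits, known)
best = int(np.argmin(scores))  # index of the most "unknown-like" candidate
```

In this toy setup the second candidate, which is far from the known cluster and has a uniform prediction, gets the lower score. A generator that co-evolves with the model, as the abstract describes, would recompute such scores as the features and classifier change during training.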
Problem

Research questions and friction points this paper is trying to address.

On-the-Fly Category Discovery
optimization misalignment
hash-based encoding
feature compression
novel category discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

On-the-Fly Category Discovery
hash-free framework
pseudo-unknown generation
kernel-energy minimization
entropy maximization
Bohan Zhang
College of Information and Electrical Engineering, China Agricultural University, China
Weidong Tang
College of Information and Electrical Engineering, China Agricultural University, China
Zhixiang Chi
University of Toronto
Computer Vision, Machine Learning
Yi Jin
Beijing Jiaotong University
computer vision, machine learning
Zhenbo Li
College of Information and Electrical Engineering, China Agricultural University, China
Yang Wang
Computer Science, Concordia University
computer vision, machine learning, deep learning, artificial intelligence
Yanan Wu
China Medical University | NEU (PhD) | CUHK (RA)
Medical Image Analysis