🤖 AI Summary
Existing generative approaches struggle to controllably enhance intra-class diversity in deep metric learning, limiting the benefits of synthetic data for downstream tasks. This work proposes a novel method that, for the first time, introduces set operations into the denoising residual space of diffusion models. By applying union operations to activate co-occurring attributes from multiple prompts and intersection operations to extract principal component directions, the approach enables fine-grained, controllable intra-class image generation. Integrating diffusion models, text embeddings, and principal component analysis, the method achieves consistent improvements over state-of-the-art techniques across multiple standard deep metric learning benchmarks, with Recall@1 gains of 3.7% on CUB-200 and 1.8% on Cars-196.
📝 Abstract
The rise of Deep Generative Models (DGMs) has enabled the generation of high-quality synthetic data. When used to augment authentic data in Deep Metric Learning (DML), such synthetic samples enhance intra-class diversity and improve performance on downstream DML tasks. We introduce BLenDeR, a diffusion sampling method that increases intra-class diversity for DML in a controllable way by applying set-theory-inspired union and intersection operations to denoising residuals. The union operation activates any attribute present in any of the prompts, while the intersection extracts their common direction through a principal-component surrogate. Together, these operations enable controlled synthesis of diverse attribute combinations within each class, addressing key limitations of existing generative approaches. Experiments on standard DML benchmarks show that BLenDeR consistently outperforms state-of-the-art baselines across multiple datasets and backbones, achieving a 3.7% increase in Recall@1 on CUB-200 and a 1.8% increase on Cars-196 under standard experimental settings.
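To make the set operations concrete, here is a minimal NumPy sketch of how a union and an intersection over denoising residuals could look. The abstract does not specify the exact formulas, so everything below is an assumption for illustration: union is taken as an element-wise largest-magnitude selection across per-prompt residuals, and intersection as the leading right singular vector of the stacked residuals serving as the principal-component surrogate for their shared direction.

```python
import numpy as np

def union_residual(residuals):
    """Hypothetical union: keep, per dimension, the residual value with the
    largest magnitude across prompts, so a strongly activated attribute from
    any single prompt survives in the combined residual."""
    stacked = np.stack(residuals)              # shape (n_prompts, d)
    idx = np.abs(stacked).argmax(axis=0)       # dominant prompt per dimension
    return stacked[idx, np.arange(stacked.shape[1])]

def intersection_residual(residuals):
    """Hypothetical intersection: use the first right singular vector of the
    stacked (uncentered) residuals as a surrogate for their common direction,
    then project the mean residual onto it."""
    stacked = np.stack(residuals)              # shape (n_prompts, d)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    direction = vt[0]                          # leading principal direction
    # Projection makes the result invariant to the sign of the singular vector.
    return (stacked.mean(axis=0) @ direction) * direction
```

In a real diffusion pipeline the residuals would be the per-prompt noise predictions at a given denoising step (flattened here to 1-D vectors for simplicity), and the combined residual would replace the single-prompt prediction before the sampler update.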