Generating Risky Samples with Conformity Constraints via Diffusion Models

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods for generating high-risk samples struggle to preserve class consistency, introducing label noise and undermining practical utility. To address this, we propose a diffusion-based framework for high-risk sample generation, the first to leverage cross-modal text-image embeddings as an implicit class-consistency constraint. Our approach introduces three novel mechanisms: (1) an explicit conformity score to quantify class compliance; (2) embedding-space filtering to enhance fidelity; and (3) risk-gradient guidance to strengthen adversarial potency. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches across risk intensity, generation quality, and class consistency. Furthermore, when used for data augmentation, the generated high-risk samples effectively improve model robustness and generalization performance.

📝 Abstract
Although neural networks achieve promising performance on many tasks, they can still fail on certain examples and thereby pose risks to applications. To discover such risky samples, prior work searches for patterns of risky samples within existing datasets or injects perturbations into them. However, the diversity of risky samples found this way is limited by the coverage of existing datasets. To overcome this limitation, recent works adopt diffusion models to produce new risky samples beyond that coverage. Yet these methods struggle with conformity between generated samples and their expected categories, which can introduce label noise and severely limit their effectiveness in applications. To address this issue, we propose RiskyDiff, which incorporates the embeddings of both texts and images as implicit constraints on category conformity. We also design a conformity score to explicitly strengthen category conformity, and introduce embedding screening and risky gradient guidance mechanisms to boost the risk of generated samples. Extensive experiments reveal that RiskyDiff greatly outperforms existing methods in degree of risk, generation quality, and conformity with the conditioned categories. We also empirically show that model generalization can be enhanced by augmenting training data with generated samples of high conformity.
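The risky gradient guidance mentioned in the abstract resembles classifier-guidance-style sampling: at each denoising step, the sample is nudged along the gradient of a risk objective (e.g., a target model's loss). The paper's exact update rule is not given here, so the following is only a minimal sketch under that assumption; `denoise_fn` and `risk_grad_fn` are hypothetical placeholders for the diffusion denoiser and the risk-objective gradient.

```python
import numpy as np

def guided_denoise_step(x, denoise_fn, risk_grad_fn, scale=0.1):
    """One denoising step with risky-gradient guidance (hypothetical sketch).

    x            -- current noisy sample (numpy array)
    denoise_fn   -- one step of the diffusion denoiser
    risk_grad_fn -- gradient of a risk objective w.r.t. the sample
    scale        -- guidance strength; larger values push harder
                    toward risky regions at the cost of fidelity
    """
    x_denoised = denoise_fn(x)
    # Nudge the denoised sample in the direction that increases risk.
    return x_denoised + scale * risk_grad_fn(x_denoised)
```

In practice this step would be applied at every iteration of the reverse diffusion process, with the conformity constraint (see below) keeping the guided samples on-class.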
Problem

Research questions and friction points this paper is trying to address.

Generates risky samples beyond existing dataset coverage
Enhances conformity between generated samples and target categories
Improves model generalization using high-conformity generated data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Incorporates text and image embeddings as implicit constraints
Designs conformity score to strengthen category alignment explicitly
Introduces embedding screening and risky gradient guidance mechanisms
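The conformity score and embedding screening described above can be pictured as cosine similarity between a generated image's embedding and the target class's text embedding (CLIP-style), with a threshold filter. The paper's exact formulation is not reproduced here; the functions and the threshold value below are illustrative assumptions.

```python
import numpy as np

def conformity_score(image_emb, text_emb):
    """Cosine similarity between an image embedding and the target
    class's text embedding (assumed form of the conformity score)."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_emb = text_emb / np.linalg.norm(text_emb)
    return float(image_emb @ text_emb)

def screen_embeddings(image_embs, text_emb, threshold=0.25):
    """Embedding screening: keep only generated samples whose
    conformity score clears the threshold (hypothetical cutoff)."""
    return [e for e in image_embs
            if conformity_score(e, text_emb) >= threshold]
```

Screening of this kind discards off-class generations before they reach training, which is how the generated risky samples avoid introducing label noise when used for data augmentation.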