Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the impact of “feasibility”—i.e., whether image attributes adhere to real-world physical constraints—on CLIP classifier training performance, focusing on background, color, and texture. Method: We propose a quantifiable feasibility metric and the VariReal editing framework, which leverages diffusion models to generate controllable feasible/infeasible samples; precise intervention is achieved via LLM-guided prompting and LoRA fine-tuning. Contribution/Results: Experiments across three fine-grained datasets show that CLIP models trained on mixed feasible and infeasible synthetic data exhibit top-1 accuracy differences <0.3%, with no statistically significant performance degradation. This constitutes the first systematic empirical validation that feasibility is not a necessary condition for effective CLIP supervised training. Our findings establish a new paradigm for efficient, low-cost synthetic data curation—enabling high-fidelity model training without strict adherence to physical realism.

📝 Abstract
With the development of photorealistic diffusion models, models trained in part or fully on synthetic data achieve progressively better results. However, diffusion models still routinely generate images that would not exist in reality, such as a dog floating above the ground or with unrealistic texture artifacts. We define the concept of feasibility as whether attributes in a synthetic image could realistically exist in the real-world domain; synthetic images containing attributes that violate this criterion are considered infeasible. Intuitively, infeasible images are typically considered out-of-distribution; thus, training on such images is expected to hinder a model's ability to generalize to real-world data, and they should therefore be excluded from the training set whenever possible. However, does feasibility really matter? In this paper, we investigate whether enforcing feasibility is necessary when generating synthetic training data for CLIP-based classifiers, focusing on three target attributes: background, color, and texture. We introduce VariReal, a pipeline that minimally edits a given source image to include feasible or infeasible attributes specified by a textual prompt generated by a large language model. Our experiments show that feasibility minimally affects LoRA-fine-tuned CLIP performance, with differences in top-1 accuracy mostly below 0.3% across three fine-grained datasets. Moreover, whether feasible or infeasible images adversely affect classification performance depends on the attribute being edited. Finally, mixing feasible and infeasible images in training datasets does not significantly impact performance compared to using purely feasible or infeasible datasets.
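The final experiment above hinges on composing training sets with a controlled fraction of feasible versus infeasible samples. A minimal sketch of that mixing step is shown below; the function name and signature are assumptions for illustration, not the paper's actual code, and in the real pipeline the sample lists would come from VariReal-edited images rather than placeholder strings.

```python
import random

def mix_training_set(feasible, infeasible, ratio, seed=0):
    """Compose a training set in which `ratio` of the samples are feasible.

    Hypothetical helper illustrating the paper's mixing experiment:
    draw round(n * ratio) feasible samples and fill the remainder with
    infeasible ones, then shuffle so the two kinds are interleaved.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    n = min(len(feasible), len(infeasible))
    n_feasible = round(n * ratio)
    mixed = (rng.sample(feasible, n_feasible)
             + rng.sample(infeasible, n - n_feasible))
    rng.shuffle(mixed)
    return mixed

# Example: a 50/50 mix of 100 samples from each pool.
train_set = mix_training_set([f"feas_{i}" for i in range(100)],
                             [f"infeas_{i}" for i in range(100)],
                             ratio=0.5)
```

Sweeping `ratio` from 0.0 (purely infeasible) to 1.0 (purely feasible) and fine-tuning the classifier at each setting is one straightforward way to reproduce the comparison the abstract describes.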
Problem

Research questions and friction points this paper is trying to address.

Investigates impact of feasibility in synthetic training data on model performance
Examines whether enforcing realistic attributes improves CLIP classifier generalization
Tests whether mixing feasible and infeasible images affects classification accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses photorealistic diffusion models to generate controllable synthetic training data
Introduces the VariReal pipeline for minimal, LLM-prompted editing of background, color, and texture attributes
Isolates the effect of feasibility on LoRA-fine-tuned CLIP classifiers across three fine-grained datasets
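The second point above relies on textual prompts, produced by an LLM, that tell the diffusion editor which attribute to change and whether the result should be feasible. The sketch below shows what assembling such a prompt might look like; the template wording and function name are assumptions for illustration and are not taken from the paper.

```python
# The three target attributes studied in the paper.
ATTRIBUTES = ("background", "color", "texture")

def build_edit_prompt(class_name, attribute, value, feasible=True):
    """Assemble an editing prompt of the kind an LLM might hand to a
    diffusion-based editor. Illustrative only: the phrasing is a guess
    at the style of prompt, not VariReal's actual template."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attribute}")
    kind = "realistic" if feasible else "physically implausible"
    return (f"Minimally edit the photo of a {class_name} so that its "
            f"{attribute} becomes {value} (a {kind} {attribute} for this "
            f"class), keeping all other attributes unchanged.")

# Example: request an infeasible color edit.
prompt = build_edit_prompt("sparrow", "color", "neon purple", feasible=False)
```

Keeping the edit minimal ("all other attributes unchanged") matters for the experiment: it ensures that any performance difference can be attributed to the one feasible or infeasible attribute rather than to unrelated changes in the image.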