SYNTHONY: A Stress-Aware, Intent-Conditioned Agent for Deep Tabular Generative Models Selection

📅 2026-03-31
🤖 AI Summary
This work addresses two problems: existing deep tabular generative models perform very differently across datasets, and there is no reliable way to select the best model for a given user intent, such as fidelity, privacy, or utility. The authors propose a stress-aware, intent-conditioned framework for generative model selection. It introduces "stress profiles", meta-features that decompose dataset difficulty into four interpretable dimensions, including long-tailed distributions and high-cardinality categories. The framework matches these profiles against a calibrated registry of generator capabilities, and the authors also evaluate a k-nearest-neighbor selector built on the same features. In experiments across seven datasets, ten generators, and three intent types, the k-nearest-neighbor selector achieves substantially higher Top-1 selection accuracy than zero-shot large language models and random baselines.
📝 Abstract
Deep generative models for tabular data (GANs, diffusion models, and LLM-based generators) exhibit highly non-uniform behavior across datasets; the best-performing synthesizer family depends strongly on distributional stressors such as long-tailed marginals, high-cardinality categorical features, Zipfian imbalance, and small-sample regimes. This brittleness makes practical deployment challenging, especially when users must balance the competing objectives of fidelity, privacy, and utility. We study intent-conditioned tabular synthesis selection: given a dataset and a user intent expressed as a preference over evaluation metrics, the goal is to select a synthesizer that minimizes regret relative to an intent-specific oracle. We propose stress profiling, a synthesis-specific meta-feature representation that quantifies dataset difficulty along four interpretable stress dimensions, and integrate it into SYNTHONY, a selection framework that matches stress profiles against a calibrated capability registry of synthesizer families. Across a benchmark of 7 datasets, 10 synthesizers, and 3 intents, we demonstrate that stress-based meta-features are highly predictive of synthesizer performance: a $k$NN selector using these features achieves strong Top-1 selection accuracy, substantially outperforming zero-shot LLM selectors and random baselines. We analyze the gap between meta-feature-based and capability-based selection, identifying the hand-crafted capability registry as the primary bottleneck and motivating learned capability representations as a direction for future work.
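To make the selection setup concrete, here is a minimal sketch of a $k$NN selector over stress profiles together with an intent-weighted regret. It is not the authors' implementation. The function names, the ordering of the four stress dimensions, the linear intent weighting, and every number in `HISTORY` are assumptions made up for illustration.

```python
import math

# Assumed order of metric scores in each tuple (higher is better).
METRICS = ("fidelity", "privacy", "utility")

# Hypothetical benchmark history. Each record pairs a 4-dimensional stress
# profile (assumed order: long tail, high cardinality, Zipfian imbalance,
# small sample) with per-synthesizer metric scores. All values are invented.
HISTORY = [
    ((0.9, 0.1, 0.2, 0.1), {"gan": (0.5, 0.7, 0.5), "diffusion": (0.8, 0.5, 0.7)}),
    ((0.8, 0.2, 0.1, 0.2), {"gan": (0.6, 0.7, 0.5), "diffusion": (0.9, 0.4, 0.8)}),
    ((0.1, 0.9, 0.8, 0.7), {"gan": (0.7, 0.8, 0.6), "diffusion": (0.4, 0.6, 0.5)}),
]


def intent_score(metric_scores, intent):
    """Collapse metric scores into one number using intent weights (assumed linear)."""
    return sum(w * s for w, s in zip(intent, metric_scores))


def knn_select(profile, intent, history=HISTORY, k=2):
    """Pick the synthesizer with the best mean intent score among the k nearest datasets."""
    neighbors = sorted(history, key=lambda rec: math.dist(profile, rec[0]))[:k]
    names = neighbors[0][1].keys()
    mean_score = {
        name: sum(intent_score(rec[1][name], intent) for rec in neighbors) / len(neighbors)
        for name in names
    }
    return max(mean_score, key=mean_score.get)


def regret(chosen, true_scores, intent):
    """Gap between the intent-specific oracle and the chosen synthesizer."""
    oracle = max(intent_score(s, intent) for s in true_scores.values())
    return oracle - intent_score(true_scores[chosen], intent)


if __name__ == "__main__":
    new_profile = (0.85, 0.15, 0.15, 0.15)   # a long-tailed, otherwise easy dataset
    fidelity_first = (1.0, 0.0, 0.0)         # intent as weights over METRICS
    choice = knn_select(new_profile, fidelity_first)
    print(choice, regret(choice, HISTORY[0][1], fidelity_first))
```

With this toy data, the same dataset gets a different recommendation when the intent changes from fidelity to privacy, which is the behavior that intent conditioning is meant to capture. A regret of zero means the selector matched the oracle for that intent.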
Problem

Research questions and friction points this paper is trying to address.

tabular generative models
model selection
distributional stressors
user intent
synthesizer performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

stress profiling
intent-conditioned selection
tabular generative models
meta-features
SYNTHONY
Hochan Son
Department of Statistics, University of California, Los Angeles
Xiaofeng Lin
PhD Candidate, Boston University
Sequential Decision Making, Robotics
Jason Ni
Department of Mathematics, University of California, Los Angeles
Guang Cheng
Department of Statistics, University of California, Los Angeles