Disjoint Generative Models

📅 2025-07-25

🤖 AI Summary

This paper addresses privacy preservation in cross-sectional tabular data synthesis with a divide-and-conquer generative framework. The original dataset is partitioned into disjoint subsets, each modeled independently by its own generative model instance, and the synthetic outputs are combined post hoc by a joining operation that requires no shared variables or identifiers. The framework supports mixing different generative model types and substantially increases privacy at only a low cost in utility. Case studies and examples on tabular data illustrate the design choices involved and show that some model types become more effective and more feasible in this setting.

📝 Abstract

We propose a new framework for generating cross-sectional synthetic datasets via disjoint generative models. In this paradigm, a dataset is partitioned into disjoint subsets that are supplied to separate instances of generative models. The results are then combined post hoc by a joining operation that works in the absence of common variables/identifiers. The success of the framework is demonstrated through several case studies and examples on tabular data that help illuminate some of the design choices that one may make. The principal benefit of disjoint generative models is significantly increased privacy at only a low utility cost. Additional findings include increased effectiveness and feasibility for certain model types and the possibility for mixed-model synthesis.

Problem

Research questions and friction points this paper is trying to address.

Generating synthetic datasets without common identifiers

Enhancing privacy with minimal utility loss

Enabling mixed-model synthesis for diverse data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Disjoint generative models framework

Partitioned subsets for separate generation

Joining operation without common variables
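
The three components listed above (partition, separate generation, join) can be illustrated with a rough sketch. This is not the authors' implementation: it assumes the partition is over columns, uses bootstrap resampling as a stand-in for a real generative model, and uses a purely random row-wise concatenation as the join. All function names (`partition_columns`, `fit_and_sample`, `random_join`, `disjoint_synthesis`) and the toy data are hypothetical.

```python
import random


def partition_columns(columns, n_parts, rng):
    """Split column names into n_parts disjoint groups (assumes n_parts <= len(columns))."""
    shuffled = columns[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def fit_and_sample(rows, cols, n_samples, rng):
    """Stand-in 'generative model': bootstrap-resample rows restricted to cols.

    A real pipeline would train a proper generative model on this partition;
    each partition could use a different model type.
    """
    projected = [{c: r[c] for c in cols} for r in rows]
    return [dict(rng.choice(projected)) for _ in range(n_samples)]


def random_join(parts, rng):
    """Join partitions without any shared key: shuffle each one independently,
    then concatenate fragments position by position (an assumed, naive join)."""
    shuffled = []
    for part in parts:
        copy = part[:]
        rng.shuffle(copy)
        shuffled.append(copy)
    joined = []
    for fragments in zip(*shuffled):
        row = {}
        for fragment in fragments:
            row.update(fragment)
        joined.append(row)
    return joined


def disjoint_synthesis(rows, n_parts=2, n_samples=None, seed=0):
    """Partition columns, synthesize each partition separately, then join."""
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    n_samples = n_samples or len(rows)
    groups = partition_columns(columns, n_parts, rng)
    parts = [fit_and_sample(rows, g, n_samples, rng) for g in groups]
    return groups, random_join(parts, rng)


# Toy data, invented purely for illustration.
real = [
    {"age": 34, "income": 52000, "city": "A", "smoker": False},
    {"age": 51, "income": 61000, "city": "B", "smoker": True},
    {"age": 27, "income": 38000, "city": "A", "smoker": False},
    {"age": 45, "income": 75000, "city": "C", "smoker": True},
]
groups, synthetic = disjoint_synthesis(real, n_parts=2, n_samples=5, seed=1)
```

Because the join in this sketch is random, any dependence between columns that land in different partitions is lost in the output, while dependence within a partition is preserved by the stand-in sampler. How a join should be chosen to limit that loss is a design question the sketch does not attempt to answer.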
