🤖 AI Summary
To address prediction uncertainty arising from class imbalance in healthcare AI, this paper proposes RIGA, a cross-modal generative balancing framework. RIGA first encodes tabular health data as synthetic images, then uses generative models such as conditional GANs (cGANs), VQVAEs, and VQGANs to produce high-fidelity minority-class samples in the image domain, and finally decodes the synthesized images back into structured tabular form to augment downstream modeling. The work introduces a "tabular → image → generation → reconstruction" paradigm that yields consistent gains across diverse downstream tasks, including XGBoost classification, Bayesian structure learning, and robustness evaluation. Experiments show that RIGA significantly improves minority-class classification accuracy and model robustness, produces high-quality, semantically consistent synthetic data, and integrates cleanly into existing healthcare AI pipelines. By bridging representation learning and generative data augmentation, RIGA advances trustworthy machine learning for high-stakes clinical decision support.
📝 Abstract
Understanding and managing uncertainty is crucial in machine learning, especially in high-stakes domains like healthcare, where class imbalance can skew predictions. This paper introduces RIGA, a novel pipeline that mitigates class imbalance using generative AI. By converting tabular healthcare data into images, RIGA leverages models such as cGAN, VQVAE, and VQGAN to generate balanced samples, improving classification performance. These image representations are processed by CNNs and then transformed back into tabular format for seamless integration. This approach enhances traditional classifiers like XGBoost, improves Bayesian structure learning, and strengthens ML model robustness by generating realistic synthetic data for underrepresented classes.
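The core of the pipeline is a reversible tabular-to-image mapping: rows are encoded as small images so that image-domain generative models can synthesize minority-class samples, and the generated images are decoded back into rows. The paper does not spell out the exact encoding here, so the sketch below uses a simple, invertible min-max-scale/pad/reshape scheme purely for illustration; the feature bounds and the clinical feature names in the comments are hypothetical.

```python
import numpy as np

def tabular_to_image(row, lo, hi, size=8):
    """Encode one tabular row as a size x size grayscale image.

    Illustrative scheme (not necessarily RIGA's exact encoding):
    min-max scale each feature to [0, 1] with precomputed per-column
    bounds, zero-pad to size*size values, and reshape to a 2D grid.
    """
    scaled = (row - lo) / (hi - lo)
    padded = np.zeros(size * size)
    padded[:row.size] = scaled
    return padded.reshape(size, size)

def image_to_tabular(img, lo, hi, n_features):
    """Invert the reshape/pad/scale steps to recover a tabular row."""
    flat = img.reshape(-1)[:n_features]
    return flat * (hi - lo) + lo

# Hypothetical per-feature bounds and a sample patient record
# (e.g. [risk flag, glucose mg/dL, systolic BP mmHg]).
lo = np.array([0.0, 50.0, 90.0])
hi = np.array([1.0, 200.0, 200.0])
row = np.array([0.3, 120.0, 140.0])

img = tabular_to_image(row, lo, hi)          # 8x8 image a cGAN/VQGAN could model
recovered = image_to_tabular(img, lo, hi, row.size)
assert np.allclose(recovered, row)           # the mapping round-trips exactly
```

In the full pipeline, a conditional generative model trained on such images (conditioned on class labels) would synthesize new minority-class images, and `image_to_tabular` would map them back into rows for downstream learners like XGBoost.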