AI Summary
Existing tabular regression methods predominantly focus on point estimation and lack explicit modeling of predictive uncertainty, which limits their applicability in high-stakes, safety-critical decision-making scenarios such as industrial automation. Conventional probabilistic regression approaches often assume Gaussian target distributions, which are insufficient to capture complex, multimodal, or heavy-tailed real-world data distributions. To address these limitations, we propose TabResFlow, the first probabilistic tabular regression model that integrates conditional spline flows for univariate target modeling, thereby eliminating restrictive parametric distributional assumptions. TabResFlow combines a dedicated MLP encoder for each numerical feature, a fully connected ResNet backbone, and a conditional spline-based normalizing flow to enable flexible, tractable density estimation. Evaluated on nine public benchmark datasets, it achieves a 9.64% improvement in likelihood scores over TreeFlow, the strongest probabilistic regression baseline, and runs on average 5.6× faster at inference than NodeFlow, the strongest deep learning alternative. On a real-world used-car price prediction task under selective regression, TabResFlow attains the best Area Under Risk Coverage (AURC) score.
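For reference, the density that a conditional flow of this kind assigns to a scalar target follows from the standard change-of-variables formula. The notation below is assumed for illustration and is not taken from the paper: $x$ denotes the encoded features, $y$ the scalar target, $f_{\theta(x)}$ a monotone, invertible spline whose parameters $\theta(x)$ are produced from the features, and $p_Z$ a simple base density (for example a standard normal).

```latex
\log p(y \mid x)
  = \log p_Z\!\bigl(f_{\theta(x)}(y)\bigr)
  + \log \left| \frac{\partial f_{\theta(x)}(y)}{\partial y} \right|
```

Because $f_{\theta(x)}$ can bend differently for every input, the resulting $p(y \mid x)$ is not tied to a fixed shape such as a Gaussian, while the log-likelihood remains exactly computable.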
Abstract
Tabular regression is a well-studied problem with numerous industrial applications, yet most existing approaches focus on point estimation, often leading to overconfident predictions. This issue is particularly critical in industrial automation, where trustworthy decision-making is essential. Probabilistic regression models address this challenge by modeling prediction uncertainty. However, many conventional methods assume a fixed-shape distribution (typically Gaussian) and estimate only its parameters. This assumption is often restrictive, as real-world target distributions can be highly complex. To overcome this limitation, we introduce TabResFlow, a Normalizing Spline Flow model designed specifically for univariate tabular regression, where commonly used simple flow networks like RealNVP and Masked Autoregressive Flow (MAF) are unsuitable. TabResFlow consists of three key components: (1) an MLP encoder for each numerical feature; (2) a fully connected ResNet backbone for expressive feature extraction; and (3) a conditional spline-based normalizing flow for flexible and tractable density estimation. We evaluate TabResFlow on nine public benchmark datasets and find that it consistently surpasses existing probabilistic regression models on likelihood scores. Our results show a 9.64% improvement over the strongest probabilistic regression model (TreeFlow) and, on average, a 5.6-fold speed-up in inference time over the strongest deep learning alternative (NodeFlow). Additionally, we validate the practical applicability of TabResFlow in a real-world used car price prediction task under selective regression. To measure performance in this setting, we introduce a novel Area Under Risk Coverage (AURC) metric and show that TabResFlow achieves superior results on this metric.
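To make the selective-regression setting concrete, the following is a minimal sketch of one common way to compute a risk-coverage curve and the area under it. The abstract does not give the paper's exact AURC definition, so everything here is an assumption for illustration: the function names `risk_coverage_curve` and `aurc`, the use of per-sample absolute error as the risk, and the use of a per-sample uncertainty score (for instance a predictive standard deviation) to decide which predictions to keep.

```python
import numpy as np


def risk_coverage_curve(errors, uncertainty):
    """Return (coverage, risk) arrays for a selective regressor.

    Hypothetical helper: samples are kept in order of increasing
    uncertainty, and the risk at coverage k/n is the mean error of the
    k most confident predictions.
    """
    errors = np.asarray(errors, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    if errors.ndim != 1 or errors.shape != uncertainty.shape or errors.size == 0:
        raise ValueError("errors and uncertainty must be non-empty 1-D arrays of equal length")

    # Most confident predictions first.
    order = np.argsort(uncertainty, kind="stable")
    sorted_errors = errors[order]

    counts = np.arange(1, errors.size + 1)
    coverage = counts / errors.size
    risk = np.cumsum(sorted_errors) / counts
    return coverage, risk


def aurc(errors, uncertainty):
    """Area under the risk-coverage curve (lower is better).

    Assumption: the area is approximated by averaging the risk over the
    n evenly spaced coverage levels 1/n, 2/n, ..., 1.
    """
    _, risk = risk_coverage_curve(errors, uncertainty)
    return float(risk.mean())


if __name__ == "__main__":
    abs_errors = [0.1, 0.2, 1.0, 3.0]
    # Uncertainty that ranks the samples in the same order as their errors.
    informative = [0.05, 0.1, 0.5, 0.9]
    # Uncertainty that ranks them in the opposite order.
    misleading = [0.9, 0.5, 0.1, 0.05]
    print("informative:", aurc(abs_errors, informative))
    print("misleading: ", aurc(abs_errors, misleading))
```

Under this definition, a model whose uncertainty estimates track its actual errors obtains a lower area, because the predictions it retains at low coverage are the accurate ones. At full coverage the risk is simply the mean error over all samples, regardless of the uncertainty ranking.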