🤖 AI Summary
To address the interpretability gap of neural networks in tabular data modeling, where their black-box nature hinders trust and regulatory compliance, this paper proposes a novel interpretable neural architecture that embeds Sparse Autoencoders (SAEs) into the backbone network. This integration enables a semantically unambiguous decomposition of nonlinear features and their explicit alignment with human-understandable concepts. The resulting model supports end-to-end interpretable prediction, with reasoning paths amenable to manual verification. Experiments demonstrate that the method matches the predictive accuracy of state-of-the-art black-box models (e.g., MLPs, XGBoost) while substantially outperforming conventional interpretable baselines (e.g., decision trees, linear models). The core contribution is the first deep structural coupling of SAEs as intrinsic interpretability modules within neural networks, thereby reconciling high representational capacity with concept-level transparency.
📝 Abstract
In data-driven applications that rely on tabular data and where interpretability is key, machine learning models such as decision trees and linear regression are commonly applied. Although neural networks can provide higher predictive performance, they are often avoided because of their black-box nature. In this work, we present XNNTab, a neural architecture that combines the expressiveness of neural networks with interpretability. XNNTab first learns highly non-linear feature representations, which are decomposed into monosemantic features using a sparse autoencoder (SAE). These features are then assigned human-interpretable concepts, making the overall model prediction intrinsically interpretable. XNNTab outperforms interpretable predictive models and achieves performance comparable to its non-interpretable counterparts.
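The SAE step described in the abstract, decomposing a backbone's dense activations into sparse, potentially monosemantic codes, can be sketched as follows. This is a minimal illustration with NumPy: the dimensions, parameter names, and loss weighting are assumptions for the example, not the paper's actual configuration or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: backbone embedding dimension d, overcomplete dictionary size m > d.
d, m = 16, 64

# SAE parameters, randomly initialized here purely for illustration.
W_enc = rng.normal(scale=0.1, size=(d, m))
b_enc = np.zeros(m)
W_dec = rng.normal(scale=0.1, size=(m, d))
b_dec = np.zeros(d)

def sae_forward(h):
    """Encode backbone activations h into sparse codes z, then reconstruct them."""
    z = np.maximum(h @ W_enc + b_enc, 0.0)  # ReLU keeps codes non-negative and sparse
    h_hat = z @ W_dec + b_dec               # linear decoder reconstructs the activation
    return z, h_hat

def sae_loss(h, z, h_hat, l1_coeff=1e-3):
    """Reconstruction error plus an L1 sparsity penalty on the codes."""
    recon = np.mean((h - h_hat) ** 2)
    sparsity = l1_coeff * np.mean(np.abs(z))
    return recon + sparsity

h = rng.normal(size=(8, d))   # a batch of 8 backbone activations
z, h_hat = sae_forward(h)
loss = sae_loss(h, z, h_hat)
```

After training with such a loss, each code dimension in `z` is a candidate monosemantic feature; assigning human-interpretable concepts to those dimensions is a separate labeling step described in the paper.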