BIRDNet: Mining and Encoding Boolean Implication Knowledge Graphs as Interpretable Deep Neural Networks

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes BIRDNet, an interpretable sparse neural network architecture that automatically discovers and exploits Boolean implication relationships (BIRs) in tabular data. BIRs are identified with a sparse-exception binomial test and collected into a typed directed knowledge graph, which is then translated directly into a layered neural network: each hidden unit corresponds to one mined rule and connects only to that rule's two features. The implication graph is mined from the data rather than supplied as an external rule base, so the network is sparse by construction and its units can be read as symbolic rules without surrogate models. Evaluated on six transcriptomic and proteomic benchmarks, BIRDNet stays within 0.02 AUROC of the strongest dense baseline while using up to 96× fewer active parameters than an architecture-matched dense MLP, and its first-layer rules recover known biological markers of cancer subtypes and tissue types.
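As a rough illustration of the mining step, the sketch below tests one candidate implication "a high implies b high" on binarized features. It is not the paper's procedure: the null model (exceptions occur at the rate expected if the two features were independent), the one-sided direction, and the helper names `binom_cdf` and `implication_pvalue` are all assumptions made for this example, and only one implication type is covered.

```python
from math import comb

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def implication_pvalue(a, b):
    """Test the rule 'a = 1 implies b = 1' on two 0/1 feature vectors.

    Exceptions are samples with a = 1 and b = 0. Under the assumed null
    (independent features) an exception occurs with probability
    P(a = 1) * P(b = 0). A small p-value means the exception quadrant is
    sparser than independence would predict.
    """
    n = len(a)
    exceptions = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    p0 = (sum(a) / n) * (1 - sum(b) / n)
    return exceptions, binom_cdf(exceptions, n, p0)

# Hypothetical toy data: b is always 1 whenever a is 1.
a = [1] * 10 + [0] * 10
b = [1] * 15 + [0] * 5
exceptions, p_value = implication_pvalue(a, b)
print(exceptions, p_value)
```

In a full pipeline a test like this would run over every ordered feature pair and implication type, with a multiple-testing correction before the surviving pairs become graph edges; those choices are not specified here and are left out of the sketch.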

📝 Abstract

Tabular data in knowledge-rich domains often carries a latent prior in the form of Boolean implication relationships (BIRs) between pairs of features. We mine such relationships with a sparse-exception binomial test. The mined implications form a typed directed graph, equivalent to a propositional rule base of 2-literal clauses. We encode this graph as the connectivity of a layered neural network, called BIRDNet, in which each hidden unit corresponds to one mined rule and binds only to its two features. We show two consequences of this design. First, the architecture is sparse by construction: at most $2/d$ of the weights in each BIR layer are active, where $d$ is the input dimension. Second, the model is interpretable: every trained unit keeps a stable symbolic identity, so rules can be read off the network without surrogate models. Unlike most neurosymbolic models, BIRDNet does not consume an external rule base; its structural prior is mined from the data. We evaluate BIRDNet on six transcriptomic and proteomic benchmarks. Our results show that BIRDNet stays within 0.02 AUROC of the strongest dense baseline, at a small accuracy cost, while using up to $96\times$ fewer active parameters than an architecture-matched dense MLP. First-layer rules recover known biological signatures across multiple cancer subtypes and tissue types, including canonical amplicons, lineage-defining co-expression modules, and immune-infiltration markers. Data and code are available at: https://github.com/MAHI-Group/BIRDNet.
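The $2/d$ sparsity bound follows from each rule unit touching two of the $d$ inputs. A minimal sketch of that connectivity is given below, assuming rules are stored as index pairs and the layer is a masked affine map followed by ReLU; the function names, the ReLU activation, and the toy rule list are illustrative assumptions, not taken from the paper or its repository.

```python
def build_mask(rules, d):
    """One row per rule; a row has ones only at that rule's two feature indices."""
    mask = [[0] * d for _ in rules]
    for r, (i, j) in enumerate(rules):
        mask[r][i] = 1
        mask[r][j] = 1
    return mask

def active_fraction(mask):
    """Share of weight slots in the layer that the mask leaves active."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

def masked_forward(x, weights, mask, bias):
    """Compute ReLU((W * M) x + b), with W * M the elementwise product."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        z = b + sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        out.append(max(0.0, z))
    return out

# Hypothetical example: three mined rules over d = 5 features.
rules = [(0, 2), (1, 3), (0, 3)]
d = 5
mask = build_mask(rules, d)
weights = [[1.0] * d for _ in rules]
bias = [0.0] * len(rules)

print(active_fraction(mask))  # 6 active slots out of 15, i.e. 2/d
print(masked_forward([1.0, 2.0, 3.0, 4.0, 5.0], weights, mask, bias))
```

Because row `r` of the mask is tied to `rules[r]`, a unit's symbolic identity stays fixed however its two weights change during training, which is what allows a rule to be read directly off the network.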

Problem

Research questions and friction points this paper is trying to address.

Boolean implication relationships

interpretable deep learning

knowledge graph

tabular data

neurosymbolic models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Boolean implication

interpretable neural network

knowledge graph mining