🤖 AI Summary
The internal structure and stabilization mechanisms of concepts in deep neural networks remain poorly understood. To address this, we propose Concept Trees—a novel framework that models concept evolution as a causally driven hierarchical tree structure. Leveraging hierarchical spectral decomposition and principal direction tracking, our method identifies semantic bifurcation points within shared representations and constructs dynamic Concept Paths, enabling linearly separable, hierarchical concept disentanglement—without label supervision. The framework supports cross-domain concept localization and interpretation. Empirically, it successfully recovers verifiable semantic hierarchies in medical diagnosis, physical reasoning, and political decision-making tasks, significantly enhancing model interpretability and concept stability. Our core contribution lies in unifying causal inference with spectral analysis to yield a computationally tractable and generalizable theoretical framework for characterizing concept organization in deep models.
📝 Abstract
Large-scale foundation models demonstrate strong performance across language, vision, and reasoning tasks. However, how they internally structure and stabilize concepts remains elusive. Inspired by causal inference, we introduce the MindCraft framework built upon Concept Trees. By applying spectral decomposition at each layer and linking principal directions into branching Concept Paths, Concept Trees reconstruct the hierarchical emergence of concepts, revealing exactly when they diverge from shared representations into linearly separable subspaces. Empirical evaluations across diverse scenarios across disciplines, including medical diagnosis, physics reasoning, and political decision-making, show that Concept Trees recover semantic hierarchies, disentangle latent concepts, and can be widely applied across multiple domains. The Concept Tree establishes a widely applicable and powerful framework that enables in-depth analysis of conceptual representations in deep models, marking a significant step forward in the foundation of interpretable AI.