An unsupervised tour through the hidden pathways of deep neural networks

📅 2025-10-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates how internal representations in deep neural networks (DNNs) acquire semantic structure and why over-parameterized networks generalize. It introduces Gride, a nonparametric intrinsic-dimension estimation framework based on nearest-neighbor distances that quantifies both the intrinsic dimensionality of hidden-layer representations and the uncertainty of the estimates. Combining density-peak clustering with unsupervised topological analysis, the work shows that semantic hierarchies emerge spontaneously through density-based stratification across layers. It further finds that wide networks trained to zero error generalize by using redundant neurons to form low-dimensional, dense representation clusters, provided the network is regularized. Finally, the semantic structure of the hidden representations can be recovered in a fully unsupervised manner, and the topography of density peaks in the output layer reflects the semantic relationships among the output classes. Together, these results provide an interpretable, empirically grounded analytical toolkit for understanding representation learning and generalization in DNNs.

📝 Abstract
The goal of this thesis is to improve our understanding of the internal mechanisms by which deep artificial neural networks create meaningful representations and are able to generalize. We focus on the challenge of characterizing the semantic content of the hidden representations with unsupervised learning tools, partially developed by us and described in this thesis, which allow harnessing the low-dimensional structure of the data. Chapter 2 introduces Gride, a method that estimates the intrinsic dimension of the data as an explicit function of the scale without performing any decimation of the data set. Our approach is based on rigorous distributional results that enable the quantification of the uncertainty of the estimates. Moreover, the method is simple and computationally efficient, since it relies only on the distances among nearest data points. In Chapter 3, we study the evolution of the probability density across the hidden layers of some state-of-the-art deep neural networks. We find that the initial layers generate a unimodal probability density, getting rid of any structure irrelevant to classification. In subsequent layers, density peaks arise in a hierarchical fashion that mirrors the semantic hierarchy of the concepts. This process leaves a footprint in the probability density of the output layer, where the topography of the peaks allows reconstructing the semantic relationships among the categories. In Chapter 4, we study the problem of generalization in deep neural networks: adding parameters to a network that already interpolates its training data typically improves its generalization performance, at odds with the classical bias-variance trade-off. We show that wide neural networks learn redundant representations instead of overfitting to spurious correlations, and that redundant neurons appear only if the network is regularized and the training error is zero.
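To make the nearest-neighbor-ratio idea concrete, here is a minimal sketch of the TWO-NN estimator, the simplest member of the family that Gride generalizes. It is not the Gride likelihood itself (which uses ratios at larger neighbor orders and an explicit scale parameter); it only illustrates how an intrinsic dimension falls out of the ratio between second- and first-neighbor distances.

```python
import numpy as np

def two_nn_id(X):
    """Maximum-likelihood intrinsic-dimension estimate from the ratio
    mu_i = r2_i / r1_i of second- to first-nearest-neighbor distances
    (the TWO-NN estimator; a simplified relative of Gride)."""
    # brute-force pairwise Euclidean distances (fine for small N)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)
    dist.sort(axis=1)
    mu = dist[:, 1] / dist[:, 0]   # ratio r2 / r1 for each point
    mu = mu[mu > 1.0]              # guard against duplicate points
    return len(mu) / np.log(mu).sum()

rng = np.random.default_rng(0)
# a 3-dimensional Gaussian cloud embedded in 20 ambient dimensions:
# the estimator should recover ~3, not 20
X = np.zeros((500, 20))
X[:, :3] = rng.standard_normal((500, 3))
print(two_nn_id(X))
```

Because the estimate depends only on nearest-neighbor distance ratios, it is insensitive to the ambient dimension; Gride extends this by using neighbors of order n1 and n2 to probe the dimension as a function of scale.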
Problem

Research questions and friction points this paper is trying to address.

Characterizing semantic content of hidden representations with unsupervised tools
Analyzing probability density evolution across neural network layers
Explaining generalization improvement in over-parameterized neural networks
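The density-evolution question above starts from a nonparametric estimate of the probability density at each data point. A crude but self-contained sketch of such an estimate is the k-NN density, rho_i proportional to k / r_k(i)^D; this is a hypothetical simplification for illustration, not the refined estimator used in the thesis.

```python
import numpy as np

def knn_density(X, k=10):
    """Crude k-NN density estimate: rho_i ~ k / r_k(i)^D, where r_k(i)
    is the distance from point i to its k-th nearest neighbor and D is
    the (here, ambient) dimension. Density-peak analyses start from
    estimates of this kind."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)
    rk = np.sort(dist, axis=1)[:, k - 1]
    return k / rk ** X.shape[1]

rng = np.random.default_rng(2)
# two well-separated 2-D clusters: the density should peak near each center
X = np.vstack([rng.normal(0.0, 0.3, (200, 2)),
               rng.normal(3.0, 0.3, (200, 2))])
rho = knn_density(X)
```

In a density-peak analysis, points of locally maximal rho become cluster centers, and following layers of a network one can watch such peaks split hierarchically as representations acquire semantic structure.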
Innovation

Methods, ideas, or system contributions that make the work stand out.

Estimates intrinsic data dimension using nearest neighbor distances
Analyzes probability density evolution across hidden layers
Shows wide networks learn redundant representations when regularized
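One simple way to quantify the redundancy claimed in the last bullet is the participation ratio of the activation covariance eigenvalues, PR = (sum λ_i)² / sum λ_i²: if many units are near-copies of each other, PR stays far below the unit count. This is a generic diagnostic sketched under our own assumptions, not the specific measure used in the thesis.

```python
import numpy as np

def participation_ratio(acts):
    """Effective number of independent directions spanned by a set of
    unit activations (rows = samples, columns = units):
    PR = (sum lam_i)^2 / sum lam_i^2 over covariance eigenvalues."""
    lam = np.linalg.eigvalsh(np.cov(acts, rowvar=False))
    lam = np.clip(lam, 0.0, None)          # clamp tiny negative eigenvalues
    return lam.sum() ** 2 / (lam ** 2).sum()

rng = np.random.default_rng(1)
base = rng.standard_normal((500, 10))       # 10 independent "features"
# a "wide layer" of 80 units that are mostly noisy copies of the 10 features
wide = np.hstack([base + 0.01 * rng.standard_normal((500, 10))
                  for _ in range(8)])
print(participation_ratio(wide))  # close to 10, far below 80
```

A wide layer whose 80 units collapse onto roughly 10 independent directions is redundant rather than overfit, which is the signature the thesis reports for regularized networks at zero training error.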