🤖 AI Summary
Traditional low-dimensional analyses struggle to capture the complex topological structure of neural network loss landscapes, which limits our understanding of optimization and generalization. This work presents Landscaper, an open-source Python package that combines Hessian-based subspace construction with topological data analysis (TDA) to characterize loss landscapes in arbitrary dimensions, revealing geometric structures such as basin hierarchy and connectivity. The study introduces the Saddle-Minimum Average Distance (SMAD), a metric that quantifies landscape smoothness and captures training transitions, such as landscape simplification, that conventional metrics miss. The authors demonstrate SMAD across various architectures and tasks, including pre-trained language models. In chemical property prediction tasks, SMAD can also serve as a metric for out-of-distribution generalization.
📝 Abstract
Loss landscapes are a powerful tool for understanding neural network optimization and generalization, yet traditional low-dimensional analyses often miss complex topological features. We present Landscaper, an open-source Python package for arbitrary-dimensional loss landscape analysis. Landscaper combines Hessian-based subspace construction with topological data analysis to reveal geometric structures such as basin hierarchy and connectivity. A key component is the Saddle-Minimum Average Distance (SMAD) for quantifying landscape smoothness. We demonstrate Landscaper's effectiveness across various architectures and tasks, including those involving pre-trained language models, showing that SMAD captures training transitions, such as landscape simplification, that conventional metrics miss. We also illustrate Landscaper's performance in challenging chemical property prediction tasks, where SMAD can serve as a metric for out-of-distribution generalization, offering valuable insights for model diagnostics and architecture design in data-scarce scientific machine learning scenarios.
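To make the idea of Hessian-based subspace construction concrete, the following is a minimal sketch under stated assumptions, not Landscaper's actual API or the authors' implementation. It uses a hypothetical toy loss (`toy_loss`), a finite-difference Hessian, and a regular grid in the span of the `k` eigenvectors with the largest-magnitude eigenvalues. All function names, the grid radius, and the step count are illustrative choices. The topological analysis step and the SMAD computation are not shown, since the abstract does not define them.

```python
import numpy as np

def toy_loss(w):
    # Hypothetical stand-in for a network loss: an anisotropic quadratic
    # bowl with a small cosine ripple (not taken from the paper).
    scales = np.arange(1, w.size + 1, dtype=float)
    return 0.5 * np.sum(scales * w**2) + 0.1 * np.sum(np.cos(3.0 * w))

def numerical_hessian(f, w, eps=1e-4):
    # Central finite-difference Hessian; only practical for tiny toy problems.
    d = w.size
    H = np.zeros((d, d))
    for i in range(d):
        e_i = np.zeros(d)
        e_i[i] = eps
        for j in range(d):
            e_j = np.zeros(d)
            e_j[j] = eps
            H[i, j] = (f(w + e_i + e_j) - f(w + e_i - e_j)
                       - f(w - e_i + e_j) + f(w - e_i - e_j)) / (4.0 * eps**2)
    return 0.5 * (H + H.T)  # symmetrize to suppress numerical noise

def sample_hessian_subspace(f, w0, k=3, radius=1.0, steps=5):
    # Build a k-dimensional basis from the eigenvectors whose eigenvalues
    # have the largest magnitude, then evaluate f on a grid in that subspace.
    eigvals, eigvecs = np.linalg.eigh(numerical_hessian(f, w0))
    top = np.argsort(np.abs(eigvals))[::-1][:k]
    basis = eigvecs[:, top]                       # shape (d, k)
    axis = np.linspace(-radius, radius, steps)
    grid = np.stack(np.meshgrid(*([axis] * k), indexing="ij"), axis=-1)
    coords = grid.reshape(-1, k)                  # one row per grid point
    losses = np.array([f(w0 + basis @ c) for c in coords])
    return basis, losses.reshape((steps,) * k)

w0 = np.zeros(6)
basis, losses = sample_hessian_subspace(toy_loss, w0, k=3)
print(basis.shape, losses.shape)
```

The resulting `losses` array is a scalar field over a `k`-dimensional grid, which is the kind of input a topological analysis of minima, saddles, and basins could then operate on. For a real network the dense Hessian would be infeasible, so an iterative eigensolver over Hessian-vector products would likely replace `numerical_hessian`; that substitution is also an assumption rather than something the abstract specifies.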