🤖 AI Summary
To address distorted latent-space geometry and poor interpolation smoothness in autoencoders, caused by suboptimal bottleneck dimension selection and insufficient regularization, this paper proposes the Rank Reduction Autoencoder (RRAE). The RRAE enlarges the latent dimension while enforcing a minimal-rank linear representation via two complementary mechanisms: a truncated singular value decomposition (strong constraint) and a low-rank regularization loss (weak constraint). Inspired by Koopman operator theory, the framework combines nonlinear representation learning with low-rank matrix optimization. Experiments on synthetic datasets and MNIST show that RRAE outperforms standard autoencoders (AEs), variational autoencoders (VAEs), and kernel PCA (kPCA), achieving higher reconstruction accuracy, smoother manifold interpolation, better latent-space connectivity (reducing "holes"), and improved generalization.
📝 Abstract
The efficiency of classical Autoencoders (AEs) is limited in many practical situations. Reducing the latent space makes feature extraction possible, but overfitting is a common issue, leading to "holes" in the AE's interpolation capabilities. Conversely, increasing the latent dimension yields a better approximation with fewer nonlinearly coupled features (as in Koopman theory or kPCA), but it does not necessarily lead to dimensionality reduction, which makes feature extraction, and consequently interpolation, problematic. In this work, we introduce the Rank Reduction Autoencoder (RRAE), an autoencoder with an enlarged latent space that is constrained to have a small, pre-specified number of dominant singular values (i.e., low rank). The latent space of an RRAE is large enough to enable accurate predictions while still allowing feature extraction; as a result, the proposed autoencoder features a minimal-rank linear latent space. To achieve this, we present two formulations, a strong and a weak one, that build a reduced basis accurately representing the latent space. The first formulation applies a truncated SVD in the latent space, while the second adds a penalty term to the loss function. We demonstrate the efficiency of both formulations on interpolation tasks, comparing the results to other autoencoders on synthetic data and MNIST.
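The two formulations in the abstract can be sketched as follows. This is a minimal, illustrative NumPy example, not the paper's implementation: the dimensions, variable names, and the random stand-in for the encoder output are all assumptions, and the weak-formulation penalty is written here as the sum of trailing singular values, one plausible choice of low-rank regularizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (illustrative, not from the paper): an enlarged
# latent space of dimension 32, constrained to rank r = 2.
latent_dim, batch_size, r = 32, 16, 2

# Stand-in for the encoder's output: one latent vector per column.
Z = rng.standard_normal((latent_dim, batch_size))

# Strong formulation (sketch): truncated SVD of the latent matrix,
# keeping only the r dominant singular values before decoding.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
Z_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Weak formulation (sketch): add a penalty on the trailing singular
# values to the training loss, driving the latent matrix toward rank r.
low_rank_penalty = np.sum(s[r:])
```

Here `Z_r` has exactly rank `r` and would replace `Z` as the decoder input in the strong formulation, whereas `low_rank_penalty` would be added to the reconstruction loss in the weak one.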