🤖 AI Summary
Variational autoencoder (VAE)-based neural topic models commonly suffer from posterior collapse, where the KL divergence term vanishes and the latent representations become ineffective; in addition, standard latent spaces inadequately capture directional similarity in high-dimensional text embeddings.
Method: We propose a spherical autoencoder framework leveraging the Spherical Sliced-Wasserstein (SSW) distance. It employs the von Mises–Fisher (vMF) distribution as a spherical prior to explicitly model directional semantics of words and documents, and replaces the standard KL divergence with the SSW distance to robustly align the prior with the aggregated posterior on the hypersphere, thereby mitigating collapse.
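The core idea of the sliced alignment can be sketched numerically: project point clouds on the hypersphere onto random great circles and compare the 1D projections. The sketch below is a simplification (it sorts projected angles rather than solving the exact circular 1D Wasserstein problem used in SSW), and all names are illustrative, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sliced_spherical_w2(x, y, n_slices=50, rng=rng):
    """Approximate sliced Wasserstein-2 distance between two point
    clouds on the unit hypersphere S^{d-1}. Simplified sketch: true
    SSW computes the exact circular 1D Wasserstein per slice; here we
    match sorted angles, ignoring the optimal circular shift."""
    d = x.shape[1]
    total = 0.0
    for _ in range(n_slices):
        # Random 2D subspace: projects the sphere onto a great circle.
        a = rng.standard_normal((d, 2))
        q, _ = np.linalg.qr(a)                            # orthonormal slice basis
        px = x @ q
        py = y @ q
        px /= np.linalg.norm(px, axis=1, keepdims=True)   # re-normalize to circle
        py /= np.linalg.norm(py, axis=1, keepdims=True)
        # Angles on the circle, then 1D W2 via sorted (quantile) matching.
        ax = np.sort(np.arctan2(px[:, 1], px[:, 0]))
        ay = np.sort(np.arctan2(py[:, 1], py[:, 0]))
        total += np.mean((ax - ay) ** 2)
    return np.sqrt(total / n_slices)

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Toy usage: a concentrated cloud vs. a near-uniform cloud on S^2.
x = unit(rng.standard_normal((256, 3)) + np.array([3.0, 0.0, 0.0]))
y = unit(rng.standard_normal((256, 3)))
print(sliced_spherical_w2(x, y))
```

In the full model, a distance of this kind replaces the KL term of the VAE objective, comparing a minibatch of encoded documents (the aggregated posterior) against samples from the spherical prior.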
Contribution/Results: Experiments on multiple benchmark datasets demonstrate substantial improvements in topic coherence and diversity, alongside enhanced performance on downstream tasks. Our approach establishes a novel paradigm for directional semantic modeling in neural topic modeling.
📝 Abstract
Modeling latent representations in a hyperspherical space has proven effective for capturing directional similarities in high-dimensional text data, benefiting topic modeling. Variational autoencoder-based neural topic models (VAE-NTMs) commonly adopt the von Mises–Fisher prior to encode hyperspherical structure. However, VAE-NTMs often suffer from posterior collapse, where the KL divergence term in the objective function sharply diminishes, leading to ineffective latent representations. To mitigate this issue while modeling hyperspherical structure in the latent space, we propose the Spherical Sliced Wasserstein Autoencoder for Topic Modeling (S2WTM). S2WTM employs a prior distribution supported on the unit hypersphere and leverages the Spherical Sliced-Wasserstein distance to align the aggregated posterior distribution with the prior. Experimental results demonstrate that S2WTM outperforms state-of-the-art topic models, generating more coherent and diverse topics while improving performance on downstream tasks.
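For reference, the von Mises–Fisher prior mentioned above can be sampled with Wood's (1994) rejection scheme. The sketch below is a generic numpy implementation, not the authors' code; `sample_vmf` and its parameters are illustrative names:

```python
import numpy as np

def sample_vmf(mu, kappa, n, rng=None):
    """Draw n samples from a von Mises-Fisher distribution on S^{d-1}
    with mean direction mu and concentration kappa (Wood 1994)."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    mu = mu / np.linalg.norm(mu)
    d = mu.shape[0]
    # Rejection-sample the component w of each sample along mu.
    b = (-2 * kappa + np.sqrt(4 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (d - 1) * np.log(1 - x0**2)
    ws = np.empty(n)
    k = 0
    while k < n:
        z = rng.beta((d - 1) / 2, (d - 1) / 2)
        w = (1 - (1 + b) * z) / (1 - (1 - b) * z)
        if kappa * w + (d - 1) * np.log(1 - x0 * w) - c >= np.log(rng.uniform()):
            ws[k] = w
            k += 1
    # Uniform directions in the orthogonal complement, combined with w.
    v = rng.standard_normal((n, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    samples = np.concatenate(
        [ws[:, None], np.sqrt(1 - ws[:, None] ** 2) * v], axis=1
    )
    # Householder reflection mapping e1 = (1, 0, ..., 0) onto mu.
    e1 = np.zeros(d)
    e1[0] = 1.0
    u = e1 - mu
    norm_u = np.linalg.norm(u)
    if norm_u > 1e-12:
        u /= norm_u
        samples = samples - 2 * (samples @ u)[:, None] * u[None, :]
    return samples

# Usage: samples concentrate around mu as kappa grows.
mu = np.array([0.0, 0.0, 1.0])
s = sample_vmf(mu, kappa=50.0, n=2000, rng=np.random.default_rng(1))
```

Larger `kappa` concentrates mass around `mu`; as `kappa → 0` the distribution approaches the uniform distribution on the sphere.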