AI Summary
This work addresses the scalability limits of conventional self-organizing maps (SOMs): GPU implementations are bounded by the memory of a single device and therefore struggle with large datasets. The authors propose FloatSOM, a SOM framework that supports distributed training across multiple GPUs, disk-backed streaming data loading, topologies beyond regular lattices, and topology-aware automated hyperparameter tuning. Across 14 synthetic and real benchmark datasets, it achieves lower quantization error than state-of-the-art SOM baselines. In the largest experiment, a 1024-node SOM was trained on one billion samples with 50 features in 6.16 minutes using eight GPUs spread across two high-performance-computing nodes.
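To put the largest experiment in perspective, the short calculation below turns the quoted figures into a dataset size and a rate. It is only a back-of-the-envelope sketch: the 4-byte float32 storage is an assumption that the source does not state, and the rate is dataset samples per second of wall time, since the number of passes over the data is not given.

```python
# Figures quoted in the summary above.
n_samples = 1_000_000_000
n_features = 50
n_gpus = 8
wall_minutes = 6.16

# Assumption (not stated in the source): each feature is a 4-byte float32.
bytes_per_value = 4

# Raw size of the feature matrix in decimal gigabytes.
dataset_gb = n_samples * n_features * bytes_per_value / 1e9

# Dataset samples per second of wall time, overall and per GPU.
wall_seconds = wall_minutes * 60
samples_per_second = n_samples / wall_seconds
samples_per_second_per_gpu = samples_per_second / n_gpus

print(f"dataset size: {dataset_gb:.0f} GB")
print(f"rate: {samples_per_second:,.0f} samples/s overall")
print(f"rate: {samples_per_second_per_gpu:,.0f} samples/s per GPU")
```

Under the float32 assumption the feature matrix alone is about 200 GB, which is consistent with the motivation for disk-backed streaming rather than holding the data in device memory.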
Abstract
GPU-accelerated Self-Organizing Map (SOM) implementations are among the most competitive options for large-scale SOM analysis, but growing dataset sizes increasingly challenge their practical use because workloads no longer fit cleanly within device-memory limits. We introduce FloatSOM, a SOM framework for scalable training and deployment that supports multi-GPU execution, out-of-memory disk-backed streaming, and novel topologies beyond regular lattices. We evaluate FloatSOM on 14 synthetic and real benchmark datasets together with controlled speed scaling benchmarks, and show that these improved topologies, combined with topology-aware hyperparameter fine-tuning, yield lower quantization error than current state-of-the-art SOM baselines. FloatSOM also sustains this performance at large scale with high-throughput distributed execution; in the largest benchmark, it trains a 1024-node SOM network on 1,000,000,000 samples with 50 features in 6.16 minutes on 8 GPUs across two separate high-performance-computing nodes.
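The evaluation metric named above, quantization error, can be computed chunk by chunk so that the full dataset never has to sit in memory at once. The sketch below is illustrative only and is not FloatSOM's API: it assumes the common definition (mean Euclidean distance from each sample to its best-matching unit), and the names `quantization_error` and `iter_chunks` are hypothetical. It uses NumPy on the CPU with random data.

```python
import numpy as np


def quantization_error(chunks, codebook):
    """Mean Euclidean distance from each sample to its nearest codebook vector.

    `chunks` is any iterable of (n_i, n_features) arrays, so the data can be
    streamed; `codebook` is an (n_nodes, n_features) array of SOM weights.
    """
    codebook_sq = np.sum(codebook ** 2, axis=1)  # shape (n_nodes,)
    total, count = 0.0, 0
    for x in chunks:
        # Squared distances via ||x||^2 - 2 x.c + ||c||^2, shape (n_i, n_nodes).
        d2 = np.sum(x ** 2, axis=1, keepdims=True) - 2.0 * (x @ codebook.T) + codebook_sq
        d2 = np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding
        total += np.sqrt(d2.min(axis=1)).sum()
        count += x.shape[0]
    if count == 0:
        raise ValueError("no samples were provided")
    return total / count


def iter_chunks(data, chunk_size):
    """Yield consecutive row slices of `data` (hypothetical helper)."""
    for start in range(0, data.shape[0], chunk_size):
        yield data[start:start + chunk_size]


# Toy stand-ins: 4096 samples, 50 features, a 1024-node codebook.
rng = np.random.default_rng(0)
data = rng.normal(size=(4096, 50))
codebook = rng.normal(size=(1024, 50))

qe = quantization_error(iter_chunks(data, 512), codebook)
print(f"quantization error: {qe:.4f}")
```

Because the function only accumulates a running sum and a count, the same loop would work over a memory-mapped array (for example one opened with `np.load(path, mmap_mode="r")`), which is one simple way to approximate disk-backed streaming on a single machine.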