🤖 AI Summary
To address the CPU-bound computational bottleneck and poor scalability of NSGA-III in large-scale many-objective optimization, this paper presents the first fully tensorized, GPU-accelerated implementation of NSGA-III. Methodologically, the authors design an end-to-end tensorized pipeline on CUDA that covers population evolution, non-dominated sorting, reference-point generation, and adaptive mutation, augmented with memory-aware caching and parallel-reduction optimizations. Contributions include: (1) strict preservation of the original selection and variation mechanisms, so the acceleration involves no approximation of NSGA-III; (2) up to 3629× speedup over CPU-based NSGA-III on standard benchmark suites; (3) empirical evidence that populations on the order of ten thousand individuals significantly improve convergence and diversity in high-dimensional objective spaces; and (4) successful application to multi-objective robotic control, yielding high-quality, highly diverse behavioral policies.
📝 Abstract
NSGA-III is one of the most widely adopted algorithms for tackling many-objective optimization problems. However, its CPU-based design severely limits scalability and computational efficiency. To address these limitations, we propose TensorNSGA-III, a fully tensorized implementation of NSGA-III that leverages GPU parallelism for large-scale many-objective optimization. Unlike conventional GPU-accelerated evolutionary algorithms that rely on heuristic approximations to improve efficiency, TensorNSGA-III maintains the exact selection and variation mechanisms of NSGA-III while achieving significant acceleration. By reformulating the selection process with tensorized data structures and an optimized caching strategy, our approach effectively eliminates computational bottlenecks inherent in traditional CPU-based and naive GPU implementations. Experimental results on widely used numerical benchmarks show that TensorNSGA-III achieves speedups of up to $3629\times$ over the CPU version of NSGA-III. Additionally, we validate its effectiveness in multi-objective robotic control tasks, where it discovers diverse and high-quality behavioral solutions. Furthermore, we investigate the critical role of large population sizes in many-objective optimization and demonstrate the scalability of TensorNSGA-III in such scenarios. The source code is available at https://github.com/EMI-Group/evomo
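To make the idea of a tensorized selection step concrete, the following is a minimal sketch of non-dominated ranking written with whole-array operations instead of nested per-individual loops. It is not the TensorNSGA-III implementation: it uses NumPy on the CPU as a stand-in for a GPU tensor library, assumes all objectives are minimized, and the function names `dominance_matrix` and `nondominated_ranks` are hypothetical. It also omits the caching strategy and the reference-point niching that the abstract describes.

```python
import numpy as np


def dominance_matrix(F):
    """Return an (n, n) boolean matrix D where D[i, j] is True if i dominates j.

    F is an (n, m) array of objective values; all objectives are minimized.
    """
    a = F[:, None, :]  # shape (n, 1, m), broadcast over columns
    b = F[None, :, :]  # shape (1, n, m), broadcast over rows
    no_worse = (a <= b).all(axis=2)        # i is no worse than j in every objective
    strictly_better = (a < b).any(axis=2)  # i is strictly better in at least one
    return no_worse & strictly_better


def nondominated_ranks(F):
    """Assign each individual a front index (0 = first non-dominated front)."""
    D = dominance_matrix(F)
    n = F.shape[0]
    ranks = np.full(n, -1)
    remaining = np.ones(n, dtype=bool)
    rank = 0
    while remaining.any():
        # An individual is dominated if any still-remaining individual dominates it.
        dominated = (D & remaining[:, None]).any(axis=0)
        front = remaining & ~dominated
        ranks[front] = rank
        remaining &= ~front
        rank += 1
    return ranks


# Illustrative data (not from the paper): four points in a 2-objective space.
F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
ranks = nondominated_ranks(F)
```

In this example the point (3, 3) is dominated by (2, 2) and lands in the second front, while the other three points are mutually non-dominated. The pairwise comparison is expressed as one broadcasted operation over an (n, n, m) tensor, which is the kind of formulation that maps naturally onto GPU parallelism, at the cost of memory that grows quadratically with population size.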