🤖 AI Summary
Neural architecture search (NAS) for multi-branch deep neural networks suffers from prohibitive computational cost and complex structural optimization. Method: This paper proposes NeuroLGP-MB, a linear genetic programming–based encoding scheme for multi-branch architectures, integrated within a semantic-aware surrogate-assisted evolutionary framework. Contribution/Results: The paper introduces semantic similarity modeling to improve surrogate generalization, and designs a more advanced surrogate model that scales from dozens or hundreds of sample points to thousands of candidate architectures. Compared with conventional surrogate models and baseline NAS methods, NeuroLGP-MB significantly reduces training overhead while still discovering high-accuracy multi-branch architectures in large-scale evolutionary search, balancing search efficiency against final model performance.
📝 Abstract
State-of-the-art Deep Neural Networks (DNNs) often incorporate multi-branch connections, enabling multi-scale feature extraction and enhancing the capture of diverse features. This design improves network capacity and generalisation to unseen data. However, training such DNNs can be computationally expensive. The challenge is further exacerbated by the complexity of identifying optimal network architectures. To address this, we leverage Evolutionary Algorithms (EAs) to automatically discover high-performing architectures, a process commonly known as neuroevolution. We introduce a novel approach based on Linear Genetic Programming (LGP) to encode multi-branch (MB) connections within DNNs, referred to as NeuroLGP-MB. To efficiently design the DNNs, we use surrogate-assisted EAs. While their application in simple artificial neural networks has been influential, we scale their use from dozens or hundreds of sample points to thousands, aligning with the demands of complex DNNs by incorporating a semantic-based approach in our surrogate-assisted EA. Furthermore, we introduce a more advanced surrogate model that outperforms baseline, computationally expensive, and simpler surrogate models.
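To make the idea of encoding multi-branch connections with Linear Genetic Programming more concrete, the following is a minimal illustrative sketch, not the encoding used by NeuroLGP-MB. It assumes a generic register-based LGP genome in which each instruction applies a layer operation to one or more source registers and writes to a destination register; an instruction with two sources acts as a merge point for two branches. The names `Instruction`, `effective_instructions`, and `count_merges`, as well as the layer labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Instruction:
    dest: int               # register written by this instruction
    op: str                 # hypothetical layer label, e.g. "conv3x3" or "concat"
    srcs: Tuple[int, ...]   # registers read; more than one source merges branches


def effective_instructions(program: List[Instruction],
                           output_reg: int = 0) -> List[Instruction]:
    """Standard LGP intron removal: scan the program backwards and keep
    only instructions whose destination register is still needed."""
    needed: Set[int] = {output_reg}
    kept: List[Instruction] = []
    for ins in reversed(program):
        if ins.dest in needed:
            needed.discard(ins.dest)   # this instruction supplies the value
            needed.update(ins.srcs)    # its inputs must now be supplied earlier
            kept.append(ins)
    kept.reverse()
    return kept


def count_merges(program: List[Instruction], output_reg: int = 0) -> int:
    """Number of effective instructions that combine two or more branches."""
    return sum(1 for ins in effective_instructions(program, output_reg)
               if len(ins.srcs) > 1)


# Register r0 is assumed to hold the network input and, at the end, the output.
program = [
    Instruction(dest=1, op="conv3x3", srcs=(0,)),    # branch A reads the input
    Instruction(dest=2, op="conv5x5", srcs=(0,)),    # branch B reads the same input
    Instruction(dest=3, op="maxpool", srcs=(2,)),    # intron: r3 is never read
    Instruction(dest=0, op="concat",  srcs=(1, 2)),  # merge both branches into r0
]

for ins in effective_instructions(program):
    print(f"r{ins.dest} = {ins.op}({', '.join(f'r{s}' for s in ins.srcs)})")
```

In this sketch the linear genome stays a flat list, which keeps mutation and crossover simple, while the register references let the decoded network form a directed acyclic graph with parallel branches. How NeuroLGP-MB actually represents layers, registers, and merge operations is not stated in the abstract.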