🤖 AI Summary
To address the limitations in fine-grained semantic modeling and domain adaptability in financial sentiment analysis, this paper proposes MoMoE—a novel architecture integrating Mixture of Experts (MoE) with hierarchical multi-agent collaboration. Methodologically, MoMoE unifies expert routing across both neural network layers (embedding MoE in the final attention layer of each agent within LLaMA-3.1 8B) and inter-agent interaction layers, establishing cross-level specialized collaboration pathways. It achieves dual-level (structural and behavioral) coordination via task decomposition, iterative optimization, and hierarchical information fusion. Experimental results demonstrate that MoMoE significantly improves accuracy and stability across multiple financial sentiment analysis benchmarks, outperforming state-of-the-art monolithic models and conventional multi-agent approaches. These findings validate the efficacy of MoE-driven multi-agent collaborative paradigms for domain-specific language understanding.
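The hierarchical collaboration described above (task decomposition, iterative optimization, hierarchical information fusion) can be sketched in miniature. The summary does not specify the fusion mechanism, so this toy models each agent as a function that nudges a shared estimate toward its specialty and fuses a layer's outputs by averaging; the `Agent` class, `lr` parameter, and mean-fusion are illustrative assumptions, not the paper's method (whose agents are full LLaMA-3.1 8B instances).

```python
import numpy as np

class Agent:
    """Toy stand-in for one MoE-augmented agent: it 'refines' the current
    estimate by pulling it toward the agent's specialty vector.
    (Hypothetical simplification; real agents are LLM instances.)"""
    def __init__(self, specialty, lr=0.5):
        self.specialty = np.asarray(specialty, dtype=float)
        self.lr = lr  # how strongly this agent pulls toward its specialty

    def refine(self, x):
        return x + self.lr * (self.specialty - x)

def layered_collaboration(x, layers):
    """Pass the estimate through successive agent layers; within each layer
    every agent refines the current estimate and the results are fused by
    averaging (a simplified 'hierarchical information fusion' step)."""
    for layer in layers:
        x = np.mean([agent.refine(x) for agent in layer], axis=0)
    return x
```

Each pass through a layer moves the shared estimate closer to the agents' consensus, mirroring the iterative-refinement loop the summary describes at a much smaller scale.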
📝 Abstract
We present a novel approach called Mixture of Mixture of Experts (MoMoE) that combines the strengths of Mixture-of-Experts (MoE) architectures with collaborative multi-agent frameworks. By modifying the LLaMA 3.1 8B architecture to incorporate MoE layers in each agent of a layered collaborative structure, we create an ensemble of specialized expert agents that iteratively refine their outputs. Each agent leverages an MoE layer in its final attention block, enabling efficient task decomposition while maintaining computational feasibility. This hybrid approach creates specialized pathways through both the model architecture and the agent collaboration layers. Experimental results demonstrate significant improvements across multiple language understanding and generation benchmarks, highlighting the synergistic benefits of combining expert routing at both the neural and agent levels.
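At the neural level, the MoE layer embedded in each agent's final attention block presumably follows standard top-k expert routing. The abstract gives no hyperparameters, so the sketch below uses illustrative values (`n_experts=4`, `k=2`, linear-map experts instead of full feed-forward networks) and is a generic top-k MoE, not the paper's exact layer.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Minimal top-k MoE layer: a learned gate scores experts per token,
    and the top-k experts' outputs are combined with renormalized gate
    weights. Sizes and expert form are illustrative assumptions."""
    def __init__(self, d_model, n_experts=4, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        self.gate = rng.normal(size=(d_model, n_experts)) / np.sqrt(d_model)
        # Each expert is a single linear map here; real experts are FFNs.
        self.experts = [rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
                        for _ in range(n_experts)]

    def forward(self, x):
        """x: (tokens, d_model) -> (tokens, d_model)."""
        scores = softmax(x @ self.gate)                   # (tokens, n_experts)
        topk = np.argsort(scores, axis=-1)[:, -self.k:]   # top-k expert ids
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            w = scores[t, topk[t]]
            w = w / w.sum()                               # renormalize over top-k
            for weight, e in zip(w, topk[t]):
                out[t] += weight * (x[t] @ self.experts[e])
        return out
```

Because only k of the experts run per token, the layer adds capacity (specialized pathways) without a proportional increase in per-token compute, which is the "computational feasibility" point the abstract makes.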