🤖 AI Summary
This work addresses the inefficiency of large reasoning models on enterprise tasks caused by excessive or redundant reasoning. To mitigate this, the authors propose Yuan3.0 Flash, an open-source multimodal large language model built on a Mixture-of-Experts architecture with 40 billion total parameters and 3.7 billion activated per inference. The model incorporates a novel reinforcement learning algorithm, Reflection-aware Adaptive Policy Optimization (RAPO), which dynamically regulates the reasoning process to suppress unnecessary computation. Using only one-quarter to one-half of the average token budget, Yuan3.0 Flash achieves state-of-the-art performance on enterprise-oriented tasks such as retrieval-augmented generation, complex table understanding, and summarization, while maintaining competitive accuracy on mathematical and scientific reasoning benchmarks, effectively balancing computational efficiency with broad generalization capability.
📝 Abstract
We introduce Yuan3.0 Flash, an open-source Mixture-of-Experts (MoE) multimodal large language model featuring 3.7B activated parameters and 40B total parameters, specifically designed to enhance performance on enterprise-oriented tasks while maintaining competitive capabilities on general-purpose tasks. To address the overthinking phenomenon commonly observed in Large Reasoning Models (LRMs), we propose Reflection-aware Adaptive Policy Optimization (RAPO), a novel RL training algorithm that effectively regulates overthinking behaviors. On enterprise-oriented tasks such as retrieval-augmented generation (RAG), complex table understanding, and summarization, Yuan3.0 Flash consistently achieves superior performance. It also demonstrates strong reasoning capabilities in domains such as mathematics and science, attaining accuracy comparable to frontier models while requiring only approximately one-quarter to one-half of the average tokens. Yuan3.0 Flash has been fully open-sourced to facilitate further research and real-world deployment: https://github.com/Yuan-lab-LLM/Yuan3.0.