FuseO1-Preview: System-II Reasoning Fusion of LLMs (Tech Report, 2025) — Achieves 74.0 Pass@1 on AIME24, outperforming OpenAI o1-preview (44.6) and o1-mini (63.4)
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion (ICLR SCI-FM Workshop, 2025) — Gains of 37.1 and 30.1 points on AlpacaEval-2 and Arena-Hard, respectively
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion (Tech Report, 2025) — State-of-the-art among 8B LLMs on AlpacaEval-2 and Arena-Hard
Weighted-Reward Preference Optimization for Implicit Model Fusion (ICLR, 2025) — Achieves 55.9% LC Win Rate vs GPT-4-Preview-1106 on AlpacaEval-2 and 46.2% vs GPT-4-0314 on Arena-Hard
FuseChat: Knowledge Fusion of Chat Models (Tech Report, 2024) — FuseChat-7B is comparable to Mixtral-8x7B-Instruct and approaches GPT-3.5-Turbo-1106 on MT-Bench
ProFuser: Progressive Fusion of Large Language Models (Tech Report, 2024) — Novel approach integrating training and inference modes for enhanced fusion