MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

📅 2026-02-13
📈 Citations: 0
Influential: 0

📝 Abstract
We present MedXIAOHE, a medical vision-language foundation model designed to advance general-purpose medical understanding and reasoning in real-world clinical applications. MedXIAOHE achieves state-of-the-art performance across diverse medical benchmarks and surpasses leading closed-source multimodal systems on multiple capabilities. To achieve this, we propose an entity-aware continual pretraining framework that organizes heterogeneous medical corpora to broaden knowledge coverage and reduce long-tail gaps (e.g., rare diseases). For medical expert-level reasoning and interaction, MedXIAOHE incorporates diverse medical reasoning patterns via reinforcement learning and tool-augmented agentic training, enabling multi-step diagnostic reasoning with verifiable decision traces. To improve reliability in real-world use, MedXIAOHE integrates user-preference rubrics, evidence-grounded reasoning, and low-hallucination long-form report generation, with improved adherence to medical instructions. We release this report to document our practical design choices, scaling insights, and evaluation framework, hoping to inspire further research.
Problem

Research questions and friction points this paper is trying to address.

medical multimodal LLMs
clinical reasoning
long-tail diseases
hallucination reduction
evidence-grounded generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

entity-aware continual pretraining
tool-augmented agentic training
evidence-grounded reasoning
low-hallucination report generation
medical vision-language foundation model
Authors
Baorong Shi
ByteDance XiaoHe Medical AI
Bo Cui
Eastern Institute of Technology, Ningbo
Nanofabrication, MEMS, electron beam and nanoimprint lithography
Boyuan Jiang
Tencent YouTu Lab; Zhejiang University
computer vision
Deli Yu
ByteDance XiaoHe Medical AI
Fang Qian
ByteDance XiaoHe Medical AI
Haihua Yang
ByteDance XiaoHe Medical AI
Huichao Wang
ByteDance XiaoHe Medical AI
Jiale Chen
ByteDance XiaoHe Medical AI
Jianfei Pan
ByteDance XiaoHe Medical AI
Jieqiong Cao
ByteDance XiaoHe Medical AI
Jinghao Lin
ByteDance XiaoHe Medical AI
Kai Wu
ByteDance
MLLM: wukaiwork[At]gmail.com
Lin Yang
ByteDance XiaoHe Medical AI
Shengsheng Yao
ByteDance XiaoHe Medical AI
Tao Chen
ByteDance XiaoHe Medical AI
Xiaojun Xiao
ByteDance XiaoHe Medical AI
Xiaozhong Ji
Nanjing University
computer vision, image processing, super resolution
Xu Wang
ByteDance XiaoHe Medical AI
Yijun He
ByteDance XiaoHe Medical AI
Zhixiong Yang
ByteDance XiaoHe Medical AI