Publications: BitNet accepted as a regular paper by JMLR 2025; BitVLA, the first 1-bit VLA model for robotics manipulation and multimodal tasks; BitNet v2, native 4-bit activations for 1-bit LLMs; BitNet b1.58 2B4T, the first native 1-bit LLM trained at scale; BitNet a4.8, enabling 4-bit activations for 1-bit LLMs; bitnet.cpp, the official inference framework; Q-Sparse, the fully Sparsely-Activated LLM; DeepNet accepted as a regular paper by TPAMI 2024.
Research Experience
Research intern in the General Artificial Intelligence (GenAI) group at MSR-Asia from Aug. 2021 to June 2025, under the supervision of Dr. Furu Wei and Shuming Ma.
Education
Received a B.Eng. degree from the Department of Computer Science and Technology, University of Science and Technology of China (USTC), advised by Professor Chao Qian. Now a Ph.D. student at CAS, supervised by Professor Xilin Chen.
Background
Research Interests: efficient architectures for large-scale foundation models, multimodal reasoning, and robotics. Currently a Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences (CAS).