Publications:
1. [ICLR 2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
2. Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
3. Model Merging in Pre-training of Large Language Models
Open-Source Projects:
1. native-sparse-attention-triton
2. FlexPrefill
3. ring-sliding-window-attention
Education
Master's student at the School of Intelligence Science and Technology, Peking University; completed undergraduate studies at Yuanpei College, Peking University.
Background
Research Interests: Natural language processing and large language models, with a focus on long-context modeling: designing novel, efficient attention mechanisms and improving the efficiency of model training and inference.