Publications
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference (NeurIPS 2025)
- Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research (EMNLP 2025 Findings)
- Smooth Reading: Bridging the Gap of Recurrent LLM to Self-Attention LLM on Long-Context Tasks (arXiv, 2025.08)
- Intern-S1: A Scientific Multimodal Foundation Model (arXiv, 2025.08)
- Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression (ICML 2025)
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs (EuroSys 2025, Best Paper)
- STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs (ICLR 2025)
- The Lottery LLM Hypothesis (ICLR 2025 Blogpost, Oral)
- ParZC: Parametric Zero-Cost Proxies for Efficient NAS (AAAI 2025, Oral)
- FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Layer Fusion (NeurIPS 2024, Spotlight)
- Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models (NeurIPS 2024)
Awards
- Excellent Research Prize (2024 DSA Excellent Research Award)
- Invited as an Area Chair for NeurIPS 2025
Other Achievements
- Gave a talk at PDL on 'Introduction to LLM Compression and Beyond'
Research Experience
Projects span model compression (pruning, quantization, and knowledge distillation), efficient inference for large language models, and automated machine learning methods such as zero-cost proxies for neural architecture search.
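As a toy illustration of one of the compression techniques named above, the sketch below shows unstructured magnitude pruning in plain PyTorch. It is a minimal example for context only, not code from any of the listed papers; the function name `magnitude_prune` and the 50% sparsity setting are arbitrary choices.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries of a weight tensor.

    A toy illustration of unstructured magnitude pruning; real methods
    (e.g. layer-wise sparsity allocation) are considerably more involved.
    """
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    # Keep only entries strictly above the threshold.
    mask = weight.abs() > threshold
    return weight * mask

# Example: prune a random 4x4 matrix to roughly 50% sparsity.
w = torch.randn(4, 4)
print(magnitude_prune(w, 0.5))
```

Practical layer-wise pruning methods allocate a different sparsity budget per layer rather than applying a single global threshold like this toy example does.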
Education
Degree: PhD; School: The Hong Kong University of Science and Technology (Guangzhou); Field: Data Science and Analytics; Advisors: Prof. Xiaowen Chu, Prof. Junxian He.
Background
Research Interests: model compression, efficient large language models, and machine learning systems. Background: PhD candidate in Data Science and Analytics at The Hong Kong University of Science and Technology (Guangzhou), supervised by Prof. Xiaowen Chu and Prof. Junxian He.