Publications
ICCV 2025: QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation
ICML 2025: BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution
ICLR 2025: ARB-LLM: Alternating Refined Binarizations for Large Language Models
ICLR 2025: GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data
arXiv 2025: DVD-Quant: Data-free Video Diffusion Transformers Quantization
arXiv 2025: AdaSVD: Adaptive Singular Value Decomposition for Large Language Models
arXiv 2025: BinaryHPE: 3D Human Pose and Shape Estimation via Binarization
arXiv 2025: ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration
arXiv 2025: Progressive Binarization with Semi-Structured Pruning for LLMs
arXiv 2025: CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution
arXiv 2025: Low-bit Model Quantization for Deep Neural Networks: A Survey
Research Experience
Interned at Amazon ASAIL, Sony AI, Xiaohongshu, and ByteDance Seed-LLM.
Education
Master's student in the Department of Computer Science and Engineering at Shanghai Jiao Tong University (SJTU), advised by Prof. Yulun Zhang and Prof. Linghe Kong. Received a B.E. degree from SJTU in 2023.
Background
Research interests: compression and acceleration of LLM, VLM, and DiT models, including binarization and post-training quantization; synthetic data augmentation and AI-generated content (AIGC), including text-to-image and text-to-video generative models.