Shiyao Li (李师尧)
Scholar

Google Scholar ID: JWaexW0AAAAJ
Ph.D. student, Tsinghua University
Large Language Models, Quantization
Citations & Impact (all-time)
  • Citations: 644
  • h-index: 8
  • i10-index: 7
  • Publications: 18
  • Co-authors: 9
Resume (English only)
Academic Achievements
  • Publications:
    - [IEEE TCSVT] Toward High-accuracy and Real-time Two-stage Small Object Detection on FPGA
    - [ACM TRETS] A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud
    - [CoLM25] Mixture of Attention Spans: Optimizing LLM Inference Efficiency with Heterogeneous Sliding-Window Lengths
    - [CVPR25] MBQ: Modality-Balanced Quantization for Large Vision-Language Models
    - [ICLR25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
    - [FPGA24] FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs
    - [ICCAD24] Towards Floating Point-Based Attention-Free LLM: Hybrid PIM with Non-Uniform Data Format and Reduced Multiplications
    - [ICML24] Evaluating Quantized Large Language Models
    - [NeurIPS24] Can LLMs Learn by Teaching? A Preliminary Study
    - [WACV24] TCP: Triplet Contrastive-relationship Preserving for Class-Incremental Learning
Research Experience
  • Work Experience: Not provided
  • Research Projects: see the publication list under Academic Achievements above
Education
  • Degree: Ph.D. (in progress)
  • University: Tsinghua University
  • Advisor: Not provided
  • Time: Not provided
  • Major: Not provided
Background
  • Research Interests: Large Language Models, Quantization
  • Fields: Efficient Deep Learning Algorithms, Multi-agent Reinforcement Learning Algorithms, Domain-Specific Acceleration, Multi-agent Systems
Miscellany
  • Personal Interests: Not provided