Zefan Cai
Google Scholar ID: eyIrttAAAAAJ
Student, Peking University
Inference Acceleration · Multi-Modality
Citations & Impact (All-time)
  • Citations: 1,601
  • H-index: 15
  • i10-index: 21
  • Publications: 20
  • Co-authors: 16
Resume (English only)
Background
  • Research interests include LLM efficiency (e.g., KV cache retrieval, offloading, compression, and other inference-optimization topics), LLM architecture (e.g., native sparse attention, test-time training), and LLM memorization (e.g., parametric memory, agent memory). He considers long-context modeling the most important problem in LLMs.