Kuntai Du
Google Scholar ID: cY5PxOQAAAAJ
University of Chicago
Large Language Models
Video analytics
Links: Homepage · Google Scholar
Citations & Impact (all-time)
Citations: 550
H-index: 8
i10-index: 7
Publications: 20
Co-authors: 0
Contact
No contact links provided.
Publications
9 items
EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving · 2025 · Cited 0
LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference · 2025 · Cited 0
AdaptCache: KV Cache Native Storage Hierarchy for Low-Delay and High-Quality Language Model Serving · 2025 · Cited 0
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications · 2025 · Cited 0
Jenga: Effective Memory Management for Serving LLM with Heterogeneity · 2025 · Cited 0
Towards More Economical Context-Augmented LLM Generation by Reusing Stored KV Cache · 2025 · Cited 0
RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation · arXiv.org · 2024 · Cited 7
DroidSpeak: KV Cache Sharing for Cross-LLM Communication and Multi-LLM Serving · 2024 · Cited 3
Resume (English only)
Co-authors
0 total (list not available)