Yushi SUN

Google Scholar ID: j9ypp3QAAAAJ
PhD from HKUST
Data Management, LLM, RAG, KGQA
Citations & Impact (all-time)
  • Citations: 210
  • H-index: 5
  • i10-index: 5
  • Publications: 9
  • Co-authors: 8 (list available)
Resume (English only)
Academic Achievements
  • Published papers:
      • CacheRAG: A Novel Approach to Enhance KG-based RAG Through Caching Mechanisms (VLDB under revision, 2026)
      • LakeHopper: Cross Data Lakes Column Type Annotation through Model Adaptation (KDD under review, 2026)
      • KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering (EMNLP Findings, 2025)
      • CRAG - Comprehensive RAG Benchmark (NeurIPS, 2024)
      • Are Large Language Models a Good Replacement of Taxonomies? (VLDB, 2024)
      • Cross-domain-aware Worker Selection with Training for Crowdsourced Annotation (ICDE, 2024)
      • RECA: Related Tables Enhanced Column Semantic Type Annotation Framework (VLDB, 2023)
  • Awards:
      • HKUST Research Travel Grant (2024)
      • RedBird Academic Excellence Award for Continuing PhD Students (2023-2024)
      • RedBird Academic Excellence Award for Continuing PhD Students (2022-2023)
      • RedBird PhD Scholarship (2021)
      • HKUST Academic Achievement Medal (2021)
      • First Class Honor graduate from HKUST (2021)
      • Hong Kong PhD Fellowship Scheme (2021-2025)
  • Professional Services:
      • Conference reviewer: CIKM 2023
      • Journal reviewer: TKDE 2024
Research Experience
  • Currently a researcher at Tencent Games (Project UP, 青云计划); worked with Prof. Nan TANG and Dr. Xin Luna DONG during PhD.
Education
  • PhD in Computer Science and Engineering from Hong Kong University of Science and Technology (HKUST) in 2025, supervised by Prof. Lei CHEN; B.S. in Computer Science and Applied Mathematics from HKUST in 2021.
Background
  • Research interests: Data Management (Data Annotation, Data Integration, and Data Organization), LLMs, Agents, and RAG. The vision is to make data usable: ensuring annotation and preparation quality, developing effective data organization paradigms, and improving the usability of data in downstream applications.