Ruokai Yin
Scholar

Google Scholar ID: MF1nKn4AAAAJ
Yale University
Computer Architecture · Domain-specific Acceleration · Deep Learning · Neuromorphic Computing
Citations & Impact (all-time)
  • Citations: 408
  • H-index: 11
  • i10-index: 11
  • Publications: 20
  • Co-authors: 8
Academic Achievements
  • Several papers have been accepted at top conferences such as NeurIPS, DAC, and MICRO, with a Best Paper nomination at ASP-DAC 2024. Specific papers include:
    - DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs
    - Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference
    - PacQ: A SIMT Microarchitecture for Efficient Dataflow in Hyper-asymmetric GEMMs
    - LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks
    - MINT: Multiplier-less INTeger Quantization for Energy Efficient Spiking Neural Networks
    - Workload-Balanced Pruning for Sparse Spiking Neural Networks
    - Wearable-based Human Activity Recognition with Spatio-Temporal Spiking Neural Networks
    - SATA: Sparsity-Aware Training Accelerator for Spiking Neural Networks
Research Experience
  • He has interned at Microsoft Azure, working with the AI System Architecture team, and at Cerebras Systems, working with the ASIC team.
Education
  • He is currently a final-year Ph.D. student in the Department of Electrical Engineering at Yale University, advised by Prof. Priyadarshini Panda. Prior to joining Yale, he earned his B.S. from the University of Wisconsin-Madison, majoring in Electrical Engineering, Computer Science, and Mathematics. During his undergraduate studies, he worked with Prof. Joshua San Miguel on designing computer architectures for stochastic computing.
Background
  • His research focuses on designing energy-efficient computer architectures, systems, and algorithms for AI workloads, particularly those involving asymmetric operand precision or sparsity. He is also interested in neuromorphic computing as an enabler of bio-plausible, energy-efficient deep learning with spiking neural networks.