Resume (English only)
Academic Achievements
Published papers at multiple international conferences in 2016, including 'On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima', 'Sparso: Context-driven Optimizations of Sparse Linear Algebra', and 'High Performance Emulation of Quantum Circuits'. Received the Intel Achievement Award in 2012 for his outstanding DGEMM implementation.
Research Experience
Conducted research in medical imaging, computational finance, and high-performance compute kernels such as DGEMM, SpMVM, and QCD. This work helped improve the MIC architecture and demonstrate its full performance potential. Served as the technical lead behind top-ranked Green500 and TOP500 positions for MIC-based systems.
Education
Ph.D. from the Department of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor, in 2003, under the supervision of Professors Edward Davidson and Scott Mahlke. His thesis focused on hardware/software co-design and compiler optimizations for efficient resource utilization on VLIW architectures.
Background
Principal Engineer at Intel's Parallel Computing Lab, focusing on application-driven parallel architecture research. His work involves the design, implementation, and analysis (including competitive analysis) of parallel algorithms and workloads for current and future generations of parallel processor systems. He has made significant contributions to the definition of the Intel® Many Integrated Core (MIC) architecture and the development of the Intel® Xeon Phi™ coprocessor.