Publications
Published papers including 'Spark Transformer: Reactivating Sparsity in FFN and Attention' (NeurIPS 2025) and 'HiRE: High Recall Approximate Top-k Estimation for Efficient LLM Inference' (arXiv). Contributed to projects such as BlockRank and Gemma3n.
Research Experience
Currently a Staff Research Scientist at Google DeepMind in NYC. Previously a postdoc at UC Berkeley, working with Prof. Yi Ma.
Education
Received a PhD in ECE from Johns Hopkins University in 2018, advised by Prof. René Vidal. Prior to that, obtained B.S. and M.S. degrees from Peking University.
Background
Research interests include sparsity, information retrieval, foundation models, and their intersections. Works broadly in machine learning, computer vision, optimization, and signal processing.