Jiecao Yu

Google Scholar ID: uxL6snYAAAAJ
Senior Staff Research Scientist, Meta Platforms, Inc.
Computer Architecture · Machine Learning
Citations & Impact (All-time)
  • Citations: 11,135
  • H-index: 10
  • i10-index: 10
  • Publications: 19
  • Co-authors: 0
Resume
Academic Achievements
  • Fast and Simplex: 2-Simplicial Attention in Triton
  • Scaling Llama 3 Training with Efficient Parallelism
  • Accelerating Transformer Inference and Training with 2:4 Activation Sparsity
  • The Llama 3 Herd of Models
  • BitSET: Bit-Serial Early Termination for Computation Reduction in Convolutional Neural Networks
  • First-Generation Inference Accelerator Deployment at Facebook
  • Compute-Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs
  • Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems
  • Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data
Research Experience
  • Meta Platforms, Inc., Menlo Park, CA
    - Senior Staff Research Scientist, 08/2025 - Present
    - Staff Research Scientist, 02/2023 - 08/2025
    - Senior Research Scientist, 08/2021 - 02/2023
    - Research Scientist, 10/2019 - 08/2021
  • Facebook, Inc.
    - Research Intern, 05/2018 - 08/2018, Menlo Park, CA
  • ARM Inc.
    - Research Intern, 05/2017 - 07/2017, Austin, TX
    - Research Intern, 06/2016 - 08/2016, Austin, TX
Education
  • Ph.D., Computer Science & Engineering, 08/2014 - 09/2019
    Advisor: Prof. Scott Mahlke
  • M.S., Computer Science & Engineering, 08/2014 - 12/2015
    University of Michigan, Ann Arbor, MI
  • B.Eng., Electronic & Information Engineering, 08/2010 - 06/2014
    Honors Minor, Advanced Class of Engineering Education (ACEE)
    Zhejiang University, Hangzhou, China
Background
  • Jiecao Yu is a Senior Staff Research Scientist at Meta Platforms, Inc. His work focuses on optimizing LLM training and inference efficiency, especially low-precision (e.g., FP8) training and inference, efficiency-aware model architecture design, distillation, and pruning. He was the main contributor to the development of the FP8 training framework for Llama 4.
Miscellany
  • Google Scholar / LinkedIn / GitHub