Developed Flash Attention, an IO-aware attention algorithm that is now widely adopted, including in MLPerf benchmarks.
Created S4, which achieved state-of-the-art results on the Long Range Arena benchmark and was the first model to solve the Path-X task.
Led the Evo project, which was featured on the cover of Science magazine.
Received the NeurIPS Test-of-Time Award for Hogwild!.
Developed and deployed high-impact frameworks such as Overton and Snorkel that are used in real-world applications.
Delivered keynote talks on Data-Centric AI and Foundation Models at top venues, including NeurIPS 2023 and SIGMOD.
Research Experience
Leads a research lab at Stanford focused on the foundations of next-generation AI systems.
Led the development of influential projects including Snorkel, Overton (built at Apple), Together, Flash Attention, ThunderKittens, S4, Hyena, HyenaDNA, and Evo.
Collaborates closely with industry partners such as Google Ads, YouTube, and Apple.
Pioneers research on long-sequence modeling (e.g., S4 and the Hyena family) and on building blocks for foundation models.
Co-founded multiple companies and a venture firm; actively mentors students toward academic and entrepreneurial careers.