- Marconi: Prefix Caching for the Era of Hybrid LLMs
- FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Awards
- Best Paper Award at the ICML Hardware Aware Efficient Training Workshop, 2022
- Inaugural Stanford Open Source Software Prize, 2024
Research Experience
Current PhD Students: Ted Zadouri, Berlin Chen, Wentao Guo, Xinle Cheng (co-advised with Ravi Netravali), Lijie Yang (co-advised with Ravi Netravali), Liane Ganti (co-advised with Elad Hazan).
Education
PhD, Department of Computer Science, Stanford University
Background
Research Interests: Machine learning and systems, with a focus on efficient training and inference, hardware-aware algorithms, and sequence models with long-range memory.
Bio: Assistant Professor of Computer Science at Princeton University; Co-founder & Chief Scientist of Together AI.