Publications
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference (ICML 2024); AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration (MLSys 2024, Best Paper Award).
Projects
AWQ has received over 2,048 stars on GitHub and is integrated into Transformers, vLLM, FastChat, TensorRT-LLM, and TGI.
Research Experience
Undergraduate researcher at the SJTU EPCC Lab during my junior year, advised by Prof. Jingwen Leng.
Education
B.Eng. in Computer Science from Shanghai Jiao Tong University (ACM Honors Class), advised by Prof. Jingwen Leng. Ph.D. student at MIT EECS, advised by Prof. Song Han.
Background
Research Interests: efficient algorithms and systems for large language models. I am currently a first-year Ph.D. student at MIT EECS.