Xinyin Ma
Google Scholar ID: jFUKS0oAAAAJ
National University of Singapore
Efficient Deep Learning · Large Language Models · Diffusion Models
Citations & Impact
All-time
  • Citations: 2,446
  • H-index: 15
  • i10-index: 18
  • Publications: 20
  • Co-authors: 14
Resume (English only)
Academic Achievements
  • September 2025: Three papers (dKV-Cache, Thinkless, and VeriThinker) accepted at NeurIPS'25, and one survey accepted by TMLR.
  • August 2025: Started an internship at NVIDIA.
  • February 2025: Co-organized the 2nd Workshop on Efficient Large Vision Models at CVPR'25.
  • January 2025: Named a KAUST Rising Star 2025.
  • November 2024: Awarded the Google PhD Fellowship.
  • December 2023: Released DeepCache, a new method that accelerates diffusion models.
  • June 2023: Released LLM-Pruner, the first structural pruning work for LLMs.
Research Experience
  • PhD research intern in the NVIDIA Fundamental Generative AI Research (GenAIR) group.
Education
  • Received bachelor's and master's degrees in computer science from Zhejiang University, advised by Prof. Weiming Lu; currently pursuing a Ph.D. at the National University of Singapore, supervised by Prof. Xinchao Wang.
Background
  • Final-year Ph.D. student at xML-Lab, National University of Singapore, under the supervision of Prof. Xinchao Wang. Research interests center on efficiency for generative models, including large language models, reasoning models, and diffusion language models; image and video diffusion models; and data-centric compression techniques.
Miscellany
  • Expected to graduate before June 2026 and currently on the job market.