Scholar

Sixun Dong

Google Scholar ID: j71Y2-4AAAAJ

Arizona State University

Computer Vision, Multimodal Learning, Visual Language Model

Citations & Impact

All-time citations: 201

Contact

Email: sixundong.ai@gmail.com

Publications

16 items

Rethinking Model Efficiency: Multi-Agent Inference with Large Models

2026

To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks

2026

Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER

2026

MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs

2025

LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries

2025

Complex Logical Instruction Generation

2025

LLM-ML Teaming: Integrated Symbolic Decoding and Gradient Search for Valid and Stable Generative Feature Transformation

2025

Efficient Post-Training Refinement of Latent Reasoning in Large Language Models

2025

Resume

Academic Achievements

Paper published: 'Feature Transformation by Semi-AR and reward-guided diffusion' accepted to NeurIPS 2025.
Paper published: 'MMTok: Multimodal Coverage Maximization for Efficient VLM Inference' released on arXiv.
Paper published: 'LiveMCP-101: a new benchmark testing AI agents’ real-world tool-use' released on arXiv.
Paper published: 'LogicIF: Complex Logical Instruction Generation' released on arXiv.
Comprehensive blog post published: 'TimesCLIP: our multimodal approach to time series forecasting with CLIP'.
Paper published: 'MLLM-Tool' accepted to WACV 2024.
Paper published: 'WeakSVR' accepted to CVPR 2023.
Paper published: 'TransRAC' accepted as an oral presentation at CVPR 2022.

Education

Master's Degree: ShanghaiTech University; Advisor: Professor Shenghua Gao

Background

Research Interests: Multimodal Learning, VLM, LLM Agent
Professional Field: Computer Vision, Natural Language Processing, and Machine Learning
Brief Introduction: Currently an independent researcher.

Miscellany