Yake Wei

Google Scholar ID: i9mWGA0AAAAJ

Renmin University of China

multimodal learning

Citations & Impact

All-time

Citations

1,012

Contact

Email: yakewei@ruc.edu.cn

Publications

7 items

Segmentation before Answering: Pixel Grounding for MLLM Visual Reasoning

2026

Information-Theoretic Decomposition for Multimodal Interaction Learning

2026

MIBench: Evaluating LMMs on Multimodal Interaction

2026

RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer

2025

MokA: Multimodal Low-Rank Adaptation for MLLMs

2025

Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception

2025

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

2025

Resume

Academic Achievements

Papers accepted by NeurIPS, ICML, CVPR, T-PAMI, ECCV, ICLR, and Pattern Recognition. Awarded the Baidu Scholarship (one of 10 Ph.D. students worldwide), the China National Scholarship for Ph.D. students (highest student honor in China), and Outstanding Graduate of Sichuan Province. Published a survey on audio-visual learning.

Research Experience

Visited the Human Sensing Lab at CMU. Attended the VALSE 2025 student workshop. Gave talks on balanced multimodal learning at Virginia Tech and TechBeat. Released MokA, a new PEFT pipeline for MLLMs that ensures both unimodal and cross-modal adaptation.

Education

Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China, advised by Prof. Di Hu. Bachelor's degree in Computer Science and Technology from the University of Electronic Science and Technology of China (UESTC), 2017 to 2021.

Background

Research interests focus on multimodal learning, including multimodal learning mechanisms and MLLMs.

Miscellany

Has a WeChat discussion group about balanced multimodal learning.
