Scholar
Fuxiao Liu
Google Scholar ID: e0P54E4AAAAJ
Research Scientist, NVIDIA
Multi-Modal Learning
MLLM
Hallucination
Citations & Impact
All-time
Citations: 2,395
H-index: 14
i10-index: 15
Publications: 20
Co-authors: 17 (list available)
Contact
Twitter, GitHub, LinkedIn
Publications
13 items
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data, 2026. Cited: 0
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline, 2026. Cited: 0
First Frame Is the Place to Go for Video Content Customization, 2025. Cited: 0
NVIDIA Nemotron Nano V2 VL, 2025. Cited: 0
Self-Rewarding Vision-Language Model via Reasoning Decomposition, 2025. Cited: 0
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness, 2025. Cited: 0
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models, 2025. Cited: 0
AIDE: Agentically Improve Visual Language Model with Domain Experts, 2025. Cited: 0
Resume (English only)
Academic Achievements
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders, ICLR 2025.
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning, ICLR 2024.
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark, CVPR 2024.
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning, NAACL 2024.
DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents, ICPRAI 2023.
COVID-VTS: Fact Extraction and Verification on Short Video Platforms, EACL 2023.
Visual News: Benchmark and Challenges in News Image Captioning, EMNLP 2021.
Research Experience
Spring 2024, NVIDIA ADLR, with Guilin Liu and Zhiding Yu on building Large Multimodal Models.
Summer 2023, Tencent AI, with Xiaoyang Wang, Jianshu Chen, Kaiqiang Song, Wenlin Yao on Visual Chart Understanding.
Spring 2023, Microsoft Research, with Linjie Li, Kevin Lin, Jianfeng Wang on Robust Visual Instruction Tuning.
Summer 2022, Adobe Research, with Chris Tensmeyer, Hao Tan, Ani Nenkova on Visual Document Understanding.
Spring 2022, UMIACS, with Abhinav Shrivastava on Fact Checking on Short Video.
Spring 2021, UVa Vislang Lab, with Vicente Ordóñez on News Image Captioning.
Education
Ph.D., University of Maryland, College Park, May 2025; Advisors: Abhinav Shrivastava, Yaser Yacoob, Tianyi Zhou, Furong Huang.
Background
Research Interests: Building customizable large models that follow humans' intent. Specialization: Computer Vision, Multimodal Learning.
Miscellany
Personal interests not provided.
Co-authors
17 total
Yaser Yacoob (University of Maryland, College Park)
Tianyi Zhou (Unknown affiliation)
Furong Huang (Associate Professor of Computer Science, University of Maryland)
Linjie Li (Microsoft)
Kevin Lin (Microsoft)
Vicente Ordóñez (Rice University, Associate Professor of Computer Science)
Andrew Tao (NVIDIA)
Zhiding Yu (Principal Research Scientist & Research Lead, NVIDIA Research)