GroundingBooth: Introduced a framework for text-to-image customization that achieves zero-shot, instance-level spatial grounding of both foreground subjects and background objects. Mixed-View Panorama Synthesis using Geospatially Guided Diffusion: Proposed a diffusion-based method, guided by geospatial information, for synthesizing panoramas in the mixed-view setting.
Research Experience
During an internship at Bosch Research, developed a world-model-based framework that unifies trajectory planning with autoregressive future-image generation, enhanced by Chain-of-Thought reasoning within a single vision-language model (VLM). Also contributing to an ongoing project on a framework that leverages the physics understanding of vision-language models to enable video generation with physically consistent motion and accurate 3D dynamics.
Education
Ph.D. candidate in Computer Science at Washington University in St. Louis, advised by Prof. Nathan Jacobs. Earned a Bachelor's degree in Electrical and Information Engineering from Tianjin University. Previously worked at the Institute of Automation, Chinese Academy of Sciences (CASIA), collaborating with Prof. Jinqiao Wang and Dr. Xu Zhao.
Background
Research interests: computer vision and multi-modal learning, with a focus on generative models and AIGC-related topics. Specifically: (1) unifying vision understanding and generation, including world models for applications such as autonomous driving; (2) controllable and personalized image/video generation and editing; (3) integrating vision-language models with generative modeling; (4) generative AI for 3D vision, including neural rendering, cross-view synthesis, and novel view synthesis. Also interested in geometric computer vision and its combination with generative models.
Miscellany
Actively looking for 2026 spring/summer research internship opportunities.