Xianzheng Ma

Google Scholar ID: NS8g2mMAAAAJ

VGG & AVL, University of Oxford

3D Computer Vision · Embodied AI · Large Language Models

Citations & Impact

All-time

Citations

1,233

H-index

i10-index

Publications

Co-authors

list available

Publications

6 items

2026

2026

2025

2025

arXiv.org · 2024

arXiv.org · 2024

Resume (English only)

Academic Achievements

Paper accepted to CVPR 2025. Submitted to TPAMI a survey and meta-analysis of 3D tasks empowered by multimodal large language models. Proposed Point-Bind, a 3D multimodal model for general 3D learning, and Point-LLM, the first 3D large language model. Proposed CapeFormer, a two-stage framework for category-agnostic pose estimation. Offered a new, alternative solution for continual test-time adaptation in Decorate the Newcomers. In Both Style and Fog Matter, reduced the domain gap caused by the combined influence of fog and style variation, without using labels. Proposed REMOTE, a reinforced motion transformation network for semi-supervised 2D pose estimation in videos.

Research Experience

Currently a member of both the Visual Geometry Group (VGG) and the Active Vision Lab (AVL) at the University of Oxford. Previously a full-time researcher at Shanghai AI Lab, under the supervision of Prof. Chao Dong.

Education

Obtained Bachelor's and Master's degrees from Wuhan University in 2018 and 2021, respectively. DPhil student in the Department of Engineering Science, University of Oxford, since October 2023, supervised by Prof. Victor Prisacariu and Prof. Iro Laina.

Background

Research interests include LLMs, 3D computer vision, and robotics, especially using LLMs' world knowledge to enhance 3D world understanding and interaction.

Co-authors

13 total