Ting Yao

Google Scholar ID: 7Yc6yssAAAAJ

HiDream.ai, previously JD.com and Microsoft Research

Video Understanding, Vision and Language, Computer Vision

Citations & Impact

All-time

Citations: 17,944

H-index: (not available)

i10-index: 132

Resume (English only)

Academic Achievements

2022 Chinese Intelligent Computing Technology Innovator
2022 First-Grade Science and Technology Prize, China Society of Image and Graphics (CSIG) – “Key Technologies and Applications of Ultrafine Image Recognition”
2022 IEEE ICME Multimedia Star Innovator Award – for outstanding innovative contribution in Multimedia Intelligence
2022 Nicolas D. Georganas Best Paper Award, ACM TOMM – “Smart Director: An Event-Driven Directing System for Live Broadcasting”
2019 IEEE Computer Society TCMC Rising Star Award – for contributions in video content recognition and description generation
2019 ACM SIGMM Rising Star Award – for contributions in activity recognition and video captioning
2015 ACM SIGMM Outstanding Ph.D. Thesis Award – “Multimedia Search by Self, External, and Crowdsourcing Knowledge”
2021 Best Open Source Award, ACM MM – “X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics”
2021 Outstanding Associate Editor, IEEE Transactions on Multimedia
2019 Second Place Best Demo Award, ACM MM – “Animating Your Life: Real-Time Video-to-Animation Translation”
2016 Best Paper Finalist, ACM ICMR
2013 Best Paper Award, ICIMCS
Champion in 10+ international multimedia analytics competitions
Rank 1, No Interaction Track, ICLR 2022 Workshop on Generalizable Policy Learning in the Physical World
Rank 1, No Restriction Track, ICLR 2022 Workshop on Generalizable Policy Learning in the Physical World

Background

Currently Co-Founder and CTO of HiDream.ai, with the stated aim of making it China's leading generative AI company.
Research interests include video content understanding, video captioning, multimodal analytics, and spatiotemporal modeling.
Co-authored over 100 peer-reviewed papers in top-tier conferences and journals.
His Pseudo-3D network is widely used as a 3D convolutional architecture for spatiotemporal data analysis.
His MSR-VTT video-to-text dataset is used by 500+ institutions worldwide.
His research has led to multiple commercial products with millions of daily active users.
Serves as Associate Editor for IEEE Transactions on Multimedia, Pattern Recognition Letters, and Multimedia Systems; frequently acts as Area Chair and keynote/tutorial speaker at major conferences.
