Published several papers in international journals on topics including online action detection, face recognition, and license plate recognition.
- A paper on online action detection was published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
- Work exploring the combination of language and video modalities for temporal moment localization appeared in Computer Vision and Image Understanding.
- Other research includes work on image super-resolution and on improving the efficiency of continual learning.
- Research results were also presented at international conferences, for example "MPViT: Multi-Path Vision Transformer for Dense Prediction" at CVPR.
Research Experience
Participated in multi-phase research projects focused on developing visual language technologies, memory storage networks, and situational reasoning and cognition technologies.
- Phase 1 (3 years): Development of technology for expressing and retrieving visual information, and building a visual memory network for abstracting and storing visual information.
- Phase 2 (3 years): Development of situational recognition and reasoning technologies based on long-term and short-term visual memories, improving understanding of relationships and situational cognition through contextual information fusion.
- Phase 3 (2 years): Development of evidence-based predictive visual intelligence technology, using multi-modal meta-learning to predict future situations.
Background
Research interests include the application of deep learning to visual intelligence technologies, cross-modal pre-training techniques, and edge AI-based visual intelligence solutions.