Lu Jiang

Google Scholar ID: jIKjjSYAAAAJ

Research Scientist @ Apple

Generative AI, Foundation Model, Robust Deep Learning, Multimedia, Video Generation

Citations & Impact

Citations (all-time): 13,546

Publications

24 items

Recovering Policy-Induced Errors: Benchmarking and Trajectory Synthesis for Robust GUI Agents

2026

From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay

2026

Modality-Agnostic Prompt Learning for Multi-Modal Camouflaged Object Detection

2026

Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment

arXiv.org · 2026

SkipSR: Faster Super Resolution with Token Skipping

2025

SciTopic: Enhancing Topic Discovery in Scientific Literature through Advanced LLM

2025

Mixture of Contexts for Long Video Generation

2025

Captain Cinema: Towards Short Movie Generation

2025

Resume

Academic Achievements

[2025/05] Introduced Seaweed-7B, a cost-efficient foundation model for video generation
[2025/03] Released Long Context Tuning (LCT), enabling scene-level video storytelling up to 5 minutes
[2025/04] Delivered a lecture at DeepLearn
[2025/02] Served as Area Chair for ICML 2025, ICCV 2025, and NeurIPS 2025
[2025/01] Introduced Seaweed-APT, a one-step video generation method for high-quality video synthesis
[2024/10] Served as Area Chair for ICLR 2025 and CVPR 2025, Action Editor for TMLR
[2024/08] Received the IJCAI-JAIR Best Paper Award
[2024/07] Received the ICML Best Paper Award
[2024/07] Gave a keynote at ICME 2024
[2024/08] Appointed Associate Editor for TPAMI
[2024/01] Served as Area Chair for CVPR 2024 and ICML 2024
[2024/01] MAGVIT-v2, a leading video tokenizer powering VideoPoet and W.A.L.T, was accepted to ICLR 2024
[2023/12] Announced VideoPoet, a primary focus throughout 2023, from initial design through v0 to current milestones
[2023/11] Released W.A.L.T, a diffusion-based transformer model for photorealistic video generation in a unified latent space
[2023/11] Released StyleDrop, enabling few-shot personalized text-to-image synthesis
[2023/03] MAGVIT for multi-task video generation was accepted to CVPR 2023 as a Highlight paper
[2023/03] Served as Area Chair for ICCV 2023 and NeurIPS 2023
[2023/01] Introduced MUSE, a masked vision transformer for text-to-image generation
[2022/12] Served as Area Chair for CVPR 2023
[2022/06] Pyramid Adversarial Training (CVPR'22) selected as a Best Paper Finalist
[2022/06] Released code for ViTGAN (ICLR'22)
[2022/06] Controlled Noisy Web Labels dataset (ICML'20) is now available via TFDS
[2021/10] Joined Carnegie Mellon University as Adjunct Faculty
[2021/09] Received Best Reviewer awards at ICML 2020–2021 and Outstanding Reviewer at NeurIPS 2021
[2021/03] Released LeCAM-GAN (CVPR'21), top-ranked on the CIFAR-100 and ImageNet (25%) leaderboards
[2021/05] Gave invited talks on robust deep learning at ICLR 2021 WeaSuL Workshop and CMU LTI
[2020/10] Congrats to Yu Wu on receiving the 2020 Google Fellowship
[2020/07] Published our work on robust learning from noisy labels at ICML 2020
[2020/06] Released The Garden of Forking Paths dataset for evaluating multiple plausible futures
[2020/05] Co-organized two CVPR 2020 workshops: AI for Content Creation and Language and Vision
[2019/09] Congrats to intern Junwei Liang on receiving the 2019 Baidu Scholarship
[2019/09] Served as panelist for NSF America's Seed Fund (SBIR) on AI
[2019/07] Best Paper Candidate at ACL 2019 (top 1%)
[2019/05] Released TIRG (CVPR'19) for vision-language image retrieval
[2019/05] Released an activity prediction model (CVPR'19), with a demo
[2019/05] Released Eidetic-3D LSTM (ICLR'19)
[2019/03] Gave guest lectures (LTI-11-775) on vision + language at CMU
[2019/01] Released Graph Distillation (ECCV'18) on GitHub

Background

Focused on visual generation and multimodal foundation models. Previously worked at ByteDance/TikTok and Google, and served as Adjunct Faculty at Carnegie Mellon University. Research addresses real-world challenges in robust deep learning, generative AI, and large-scale multimodal data.

Co-authors

10 total