About the job
With rapid advances in AGI and foundation models, multimodal content creation across text, image, and video is undergoing a fundamental transformation. This role focuses on building next-generation multimodal and agentic foundation models that enable intelligent, efficient, end-to-end creative workflows. You will work on understanding across all modalities, AIGC-based image and video generation, and agentic systems, optimizing models through large-scale training and post-training (e.g., SFT, RL). The role also involves designing efficient model architectures and advancing reinforcement learning techniques to improve model capability, scalability, and real-world performance in content creation scenarios.
Responsibilities
Conduct research and development in generative AI and multimodal models (e.g., image, video).
Develop large-scale foundation models (LLMs/VLMs) through post-training techniques.
Design and build models for creative applications on the TikTok platform, driving end-to-end impact from research to production through cross-functional collaboration.
Explore new AI-driven product opportunities and contribute to next-generation creative experiences.
Qualifications
Minimum
Currently completing or recently completed a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
Proficiency in training generative AI or large models using frameworks such as PyTorch or JAX.
Strong programming skills and solid fundamentals in machine learning.
Strong problem-solving ability and motivation to tackle real-world challenges.
Good communication and collaboration skills in fast-paced environments.
Preferred
PhD in Generative AI, Machine Learning Systems, or a related field, or equivalent experience.
Strong research background in one or more areas: generative AI, LLMs/VLMs, or ML systems.
Hands-on experience in at least one of the following areas: image/video generation and editing; VLM/LLM fine-tuning; efficient model design and optimization; reinforcement learning methods (e.g., RLHF, DPO, GRPO).
Track record of research publications in conferences such as CVPR, ECCV, ICCV, NeurIPS, ICLR, SIGGRAPH, or SIGGRAPH Asia.