Vision Language Models/VLM Research Scientist Graduate (Trust and Safety) - 2026 Start (PhD)

TikTok
San Jose, California

About the job

Our Foundations and Intelligence Service R&D team is fast-growing and responsible for building state-of-the-art foundation models, such as LLMs, VLMs, and omni models. Our mission is to bridge foundation models and downstream business scenarios, and to use the world knowledge in foundation models to improve user experiences across TikTok, including content moderation, search and recommendations, and client AI. We are looking for researchers in the LLM, VLM, and omni model domains who are experienced in single- and multi-modality LLM/VLM pretraining and applications, including evaluation, data processing and recipes for pre-training and post-training, reinforcement-learning-based alignment, and efficient training and inference. There are undoubtedly many unsolved problems in the LLM domain that could have a huge impact on industry and academia. At TikTok, we have real applications, resources, and patience for technology incubation.

Responsibilities

- Enhance VLMs with specialized capabilities such as OCR and captioning to optimize performance for TikTok business applications

- Explore model architectures and inference-efficient model designs to enable application at scale in downstream TikTok businesses

- Work closely with cross-functional teams to plan and implement projects harnessing VLMs for diverse purposes and vertical domains

- Extend insights and impact from industry to academia

Qualifications

Minimum

- Individuals who are completing or have recently completed a PhD degree in Computer Science, Data Science, Artificial Intelligence, or a related field

- Proficiency in programming languages such as Python, Rust, or C++ and a track record of working with deep learning frameworks (e.g., PyTorch, DeepSpeed, Megatron, vLLM)

Preferred

- Excellent problem-solving skills and a creative mindset to address complex AI challenges. Demonstrated ability to drive research projects from idea to implementation, producing tangible outcomes.

- Published research papers or contributions to the LLM community are a significant plus.

- Experience with inference tuning and inference acceleration. Deep understanding of GPUs and/or other AI accelerators, and experience with large-scale AI networks, PyTorch 2.0, and similar technologies.

- Experience with the evaluation of AI systems and with LLM application and agent development is desirable.

- Strong understanding of cutting-edge LLM research (e.g., long context, multimodality, alignment research, agent ecosystems); practical expertise in implementing these advanced systems is a plus

- Strong understanding of distributed computing frameworks and of performance tuning and verification for training, fine-tuning, and inference; familiarity with PEFT, RL, MoE, CoT, or LangChain is a plus