Research Scientist - Data and State Acceleration - Global Frontier Tech Recruitment Program - 2027 Start (PhD)

TikTok
San Jose, California

About the job

We are looking for talented individuals to join our team in 2027. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career at our company, where inspiration is infinite. Successful candidates must be able to commit to an onboarding date by the end of 2027. Please state your availability and graduation date clearly in your resume.

Responsibilities

Design and implement real-time and offline data architecture for large-scale recommendation systems.

Build scalable and high-performance streaming Lakehouse systems that power feature pipelines, model training, and real-time inference.

Collaborate with ML platform teams to support PyTorch-based model training workflows and design efficient data formats and access patterns for large-scale samples and features.

Own core components of our distributed storage and processing stack, from file format to stream compaction to metadata management.

Qualifications

Minimum

Individuals who are completing or have recently completed a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.

Experience building large-scale distributed systems, preferably in storage, stream processing, or ML infrastructure.

Understanding of Apache Flink internals, with hands-on experience in state management, connectors, or UDFs.

Familiarity with modern Lakehouse technologies such as Apache Paimon, Iceberg, Delta Lake, or Hudi, especially around incremental ingestion, schema evolution, and snapshot isolation.

Preferred

Experience in designing and optimizing Flink + Paimon architectures for unified batch/stream processing.

Familiarity with feature storage and training data pipelines, and their integration with PyTorch, especially for large-scale model training.

Knowledge of columnar file formats (Parquet, ORC, Lance) and how they are used in feature engineering or ML data loading.

Proficiency in Java, Scala, or C++, and strong debugging and performance-tuning skills.

Previous experience in Lakehouse metadata management, compaction scheduling, or data versioning.

Knowledge of legacy data stores such as HBase or Kudu.