Real-time Video Researcher

Pika
Palo Alto, CA, USA
2026-04-04
On-site

About the job

At Pika, we are pioneering next-generation creative infrastructure built around real-time video generation and intelligent, agentic platforms. We are seeking an accomplished Real-time Video Researcher to drive forward our mission to make agentic real-time video technology accessible, dynamic, and transformative for millions of creators. As a core member of our research team, you will be integral to designing and building foundational technologies, developing novel approaches for real-time video synthesis, and orchestrating intelligent agentic systems that power scalable, interactive multimedia experiences. You will collaborate closely with engineering and product teams, shaping the future of real-time creative platforms.

Responsibilities

Lead and contribute to research efforts focused on real-time video generation, streaming, editing, and orchestration of agentic platform infrastructure

Design and prototype novel algorithms and architectures for real-time, high-fidelity video synthesis and interactive experiences

Focus heavily on real-time aspects of video generation and synthesis

Work on diffusion model distillation and develop diffusion-based world models for video applications

Train and finetune autoregressive models and diffusion models with a focus on real-time performance

Curate specific datasets, especially for camera motion and human motion data

Collaborate with cross-functional teams to bring research advancements into production-ready technologies

Publish work in top-tier conferences and journals, and communicate results internally and externally

Stay at the cutting edge of the field, monitoring new developments in real-time video, generative AI, multimodal systems, and agentic orchestration

Qualifications

Minimum

5+ years of relevant experience, including research during graduate studies, in real-time video generation, deep learning, or related fields such as image/audio generation, along with deep experience in multimodal systems

Demonstrated impact as first author on major publications in top conferences or journals (e.g., CVPR, ICCV, NeurIPS, SIGGRAPH)

Deep expertise in computer vision, video synthesis, generative models, and machine learning

Hands-on experience with diffusion model distillation and developing diffusion-based world models

Experience training and finetuning autoregressive models, especially for real-time applications

Strong capability and experience in data curation, especially for camera motion and human motion datasets

Experience developing and deploying real-time video systems and/or agentic orchestration infrastructure

Strong programming and prototyping skills (Python, PyTorch, TensorFlow, etc.)

Passion for building creative tools and platforms that empower users

Excellent communication and collaboration skills

Preferred

No preferred qualifications listed.