Machine Learning Scientist 5 - Multi-modal Algorithms for Games

About the job

At Netflix, our mission is to entertain the world. Together, we are writing the next episode - pushing the boundaries of storytelling, global fandom and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology. Come be a part of what’s next.

The Studio Media Algorithms team is at the forefront of algorithmic innovation to enhance and support the creation of Netflix’s entertainment content, including games. In this role, you will be embedded within this team while collaborating very closely with a specialized Games Studio R&D team. This incubation-style team is chartered to lead our investments in building new kinds of games leveraging emerging technologies to support our creators and reach player audiences in new ways.

Responsibilities

Model Adaptation & Alignment: Design and own the fine-tuning and alignment of LLMs and VLMs in PyTorch, leveraging modern preference learning and reinforcement learning to enhance reasoning, tool-use, and agentic workflows for interactive game systems.

Algorithmic Model Optimization: Lead efforts in model compression (specifically knowledge distillation, structural pruning, and architectural refinement) to create efficient variants of large models that meet strict latency, cost, and quality constraints.

Generative Visuals & Diffusion: Develop and optimize diffusion-based models for image, video, and 3D generation, including distillation and efficiency techniques for viable game-time performance.

Pragmatic Model Integration: Strategically evaluate and integrate SOTA open-source and commercial models while building internal "layers," adapters, and enhancements to fill gaps in creative control.

Multi-modal Interaction: Optimize and integrate audio (ASR/TTS), language, and vision models to enable low-latency, cross-modal reasoning and interaction.

Qualifications

Minimum

Multi-modal Architecture Expertise: Strong foundation in deep learning architectures, with deep expertise in Transformers and Diffusion architectures powering LLMs, VLMs, and generative visuals, including their specific performance bottlenecks.

Optimization Specialist: Proven track record in algorithmic model optimization (e.g., distillation, quantization-aware training, or pruning) to reduce FLOPs and memory footprint.

Data-Centric Mindset: Skilled in data cleaning, curation, and the creation of synthetic data for complex evaluation and training pipelines.

Pragmatic Builder: Ability to prioritize impact by deciding when to use commercial APIs/OSS weights versus when to invest in proprietary R&D to solve efficiency or quality problems.

Programming: Expert proficiency in Python and deep learning frameworks (such as PyTorch); ability to collaborate with engineering on low-level performance constraints.

Preferred

Prior experience optimizing models for heterogeneous hardware (mobile, cloud GPU, and custom edge devices).

Expertise in audio-visual multimodal models and video generation.