About the job
Join Microsoft AI in building one of the world’s most advanced foundation models. Our mission is to push the boundaries of scale, performance, and deployment, creating frontier AI systems that power transformative experiences across Microsoft. The Multimodal team tackles some of the most challenging problems in deep learning at scale. Our work forms the backbone of initiatives across Microsoft AI, enabling breakthroughs that shape the future of AI.
Responsibilities
Develop algorithms, design model architectures, conduct experiments, champion measurement and evaluation, and innovate on datasets and data pipelines.
Improve training and deployment efficiency, paying careful attention to detail, persevering, and learning from everyone’s attempts whether successful or not.
Follow a rigorous data-driven approach grounded in meticulous ablation studies and scientific analysis.
Innovate and iterate over ideas, prototypes, and product.
Collaborate closely with teams on infrastructure, data engineering, pre-training, post-training, and product feedback.
Advance the AI frontier responsibly.
Embody our culture and values.
Qualifications
Minimum
Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Preferred
Master's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 8+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 12+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience.
Experience with large-scale AI systems: design and deployment of distributed architectures, multimodal or conversational models; proficiency with ML frameworks (e.g., PyTorch, TensorFlow) and cloud/HPC environments (e.g., Azure).
Expertise in data engineering for foundation models: multimodal dataset design, curation, annotation pipelines, quality evaluation, bias detection, and understanding of privacy, compliance, and Responsible AI principles.
Background in LLM interaction and deployment: practical work in prompt engineering, safety-aligned evaluation, and integration of conversational AI into production systems.
Cross-functional collaboration and communication: ability to produce clear technical documentation, partner with engineering, product, and design teams, and contribute to knowledge sharing; demonstrated application of emerging AI technologies and best practices.