About the job
At Microsoft AI, we are on a mission to develop the most cutting-edge algorithms for post-training large language models (LLMs) and to ship those models to the millions of people who use Copilot every day. The AI Post-Training team at Microsoft AI is responsible for all aspects of post-training and for improving our pre-trained models to advance the state of the art on a wide variety of internal and external benchmarks. Our goal is to push our models’ capabilities in reasoning, instruction following, math, code, tool use, and agentic tasks, among many other areas. This role involves contributions to all stages of the post-training process: driving data collection and acquisition, building evaluations of model capabilities, and applying advanced reward modeling and RL techniques to develop and improve the post-training recipe. We work on the bleeding edge and leverage the most powerful pre-trained models and algorithms for our needs. We are an interdisciplinary team of engineers and scientists who learn from each other and collaborate to create the best models. We are looking for outstanding individuals excited about contributing to the next generation of models that will transform the field.
Responsibilities
Develop data collection, evaluation, and post-training methods for models.
Design hypotheses and experiment plans for rapidly iterating on model performance.
Qualifications
Minimum
Bachelor's Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Experience with reward modeling, RL, or other post-training techniques.
Preferred
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Master's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Demonstrated experience in large-scale AI.
Passion for conversational AI and its deployment.
Demonstrated written and verbal communication skills with the ability to work closely with cross-functional teams, including product managers, designers, and other engineers.
Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in AI.
Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team.