About the job
Microsoft AI is hiring a Member of Technical Staff, AI Networking, to design and scale the world’s most advanced high-performance networks powering Copilot and next-generation AI systems. Join the team building the fabric that connects frontier-class datacenters, enables multi-gigawatt AI supercomputers, and supports the training of the most sophisticated AI models on the planet.
Responsibilities
Advanced RoCE transport design, congestion control, and ECN/WRED/DCTCP tuning
Fabric architecture, topology planning, network modeling, and scaling strategy
Telemetry, observability, reliability engineering, and automated troubleshooting
Develop, deploy, and tune novel routing techniques to achieve reliability in large networks
Work with world-class network designers such as NVIDIA, Broadcom, and in-house silicon/network co-design teams
AI training and inference cluster bring-up, performance benchmarking, and root-cause analysis
Gather data and insights to develop the pretraining compute roadmap
Find a path past roadblocks to get your work into the hands of users quickly and iteratively
Enjoy working in a fast-paced, design-driven product development cycle
Embody our Culture and Values
Qualifications
Minimum
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Preferred
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.