About the job
Our team designs and develops fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These services are essential for running distributed AI/ML/HPC workloads across thousands of GPUs that use interconnect technologies such as RoCE and InfiniBand.
Responsibilities
No responsibilities listed.
Qualifications
Minimum
No minimum qualifications listed.
Preferred
No preferred qualifications listed.