About the job
We are looking for outstanding Senior Deep Learning Software Engineers to develop and productize NVIDIA's deep learning solutions for autonomous vehicles. In the Solution Engineering-Automotive Machine Learning team, we are developing new technologies that allow more capable deep learning models to be deployed in Physical AI systems. As part of the role, you will develop compiler technology that optimizes larger and better models for NVIDIA's unique hardware architecture. You will also be exposed to the most pressing problems that our partners face during product development, and you will coordinate with other architecture and software teams to develop the best solutions for partners working on our platforms.
Responsibilities
Develop compiler technologies to accelerate deep learning inference on NVIDIA hardware platforms for Physical AI.
Work across a wide range of abstractions, from model fine-tuning and quantization to low-level kernel development and performance optimization.
Develop workflows that let users leverage frameworks (e.g. PyTorch, JAX) and compiler technologies (e.g. MLIR, Triton) without forgoing performance.
Work with customers to help accelerate their workloads on NVIDIA platforms.
Stay up to date with the latest research and innovations in deep learning, and implement and experiment with new insights to improve NVIDIA's Physical AI DNNs.
Qualifications
Minimum
MS or PhD degree in computer science, computer vision, robotics, computer architecture, or a related technical field (or equivalent experience).
5+ years of work experience in software development.
2+ years of experience in **developing** deep learning frameworks (e.g. PyTorch, JAX, TensorFlow, ONNX) or compiler technologies (e.g. LLVM, MLIR, TVM, Triton).
Domain experience in technologies used for GPU programming (e.g. CUDA C++ and/or DSLs like OpenAI Triton) or with system-level optimization for deep learning training or inference.
Strong C/C++ programming skills.
Familiarity with state-of-the-art deep learning techniques for inference and training.
Willingness to take action and strong analytical skills.
Preferred
Experience with MLIR, LLVM, or similar compiler technologies.
Background in low-precision inference, quantization, and compression of DNNs.
Experience with GPU programming.
Experience with building DSLs or optimizing compilers (e.g. graph compiler or kernel generator) for GPUs or other accelerated computing platforms.
Open source project ownership or contribution, healthy GitHub repositories, and guiding and/or mentoring experience.