About the job
We are now looking for a Senior GPU & Deep Learning Architect! The NVIDIA GPU Architecture group is looking for world-class architects and software developers to join and lead our various architecture efforts. A key part of NVIDIA's strength is innovation in graphics and parallel computing, delivering the highest performance in the world for deep learning and parallel processing algorithms. We are constantly looking for ways to improve our GPU architecture, especially for deep learning workloads, both training and inference, and to maintain our leadership by developing new parallel programming models and the new architectures needed to support them.
In this position, you will be responsible for developing and enhancing features in the GPU architecture that advance the state of the art in parallel programming models or parallel computing performance. You will work with other world-class architects and researchers to build simulators, map deep learning workloads to current and future hardware, and validate new architectural features.
Responsibilities
Design new hardware features for future processing architectures targeted at deep learning workloads, for both training and inference.
Advance the state of parallel computation.
Be knowledgeable about future parallel programming models and their impact on hardware.
Develop software for various hardware simulators, test infrastructure, and metrics systems, including databases.
Work in a team to document, design, and develop tools that analyze, simulate, validate, and verify functional or performance models.
Develop tests, test plans, and testing infrastructure for new graphics or parallel processing architectures.
Be hungry to learn and to work on simulators, RTL, and real silicon.
Qualifications
Minimum
MS in Computer Science, Electrical Engineering, or Computer Engineering, or equivalent experience.
Experience working with hardware targeted at deep learning, or mapping deep learning algorithms to hardware.
8+ years of relevant industry experience in GPU or other parallel programming architectures (or other equivalent experience).
Strong programming ability in C, C++, Perl, and Python.
Background in computer architecture, parallel processing, signal processing, and/or high-performance computing.
Preferred
Knowledge of the state of the art in deep learning algorithms and attention mechanisms is a huge plus.