About the job
NVIDIA is seeking extraordinary architects to develop processor and system architectures that accelerate machine learning, data analytics, and high-performance computing applications. This position offers the chance to make a meaningful impact in a dynamic, technology-focused company.
Responsibilities
Validate and analyze performance of GPU-accelerated system and software architectures that advance the frontier of deep learning performance.
Debug deep learning and data analytics software to identify root causes of performance bottlenecks.
Develop scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites.
Work with the CUDA and AI Compiler teams to pinpoint and resolve performance issues.
Engage AI/ML training and inference performance teams to identify and optimize critical deep learning layers.
Collaborate with hardware architecture performance teams to define expectations for emerging deep learning hardware features.
Qualifications
Minimum
Master's or PhD in Computer Science, Electrical Engineering or Computer Engineering, or equivalent experience.
Proven expertise in software design, including debugging, performance analysis, and test development.
Hands-on experience with parallel programming, not necessarily on GPUs.
Strong understanding of computer architecture, with practical experience in performance debugging.
Ability to identify bottlenecks, optimize resource utilization, and enhance system throughput.
Fluency in programming languages such as Python, C, and C++.
Preferred
Strong foundation in machine learning and deep learning fundamentals to complement your expertise in computer architecture.
Strong background in high-performance, power-efficient design, energy-efficient high-performance computing, and performance analysis and profiling to identify bottlenecks.
Experience with GPU computing and parallel programming models.
Work experience with analytical performance modeling, profiling, and analysis.