About the job
NVIDIA is seeking outstanding Performance Analysis Architects to help analyze and develop the next generation of architectures that accelerate AI and high-performance computing applications.
Responsibilities
Develop innovative HW architectures to extend the state of the art in parallel computing performance, energy efficiency and programmability.
Benchmark and analyze AI workloads in single and multi-node configurations.
Develop high-level simulators and analysis tools in C++/Python.
Evaluate PPA (performance, power, area) for hardware features and system-level architectural trade-offs.
Work closely with peer architecture teams and product management to guide product development.
Keep abreast of emerging trends and research in deep learning.
Qualifications
Minimum
MS or PhD in a relevant discipline (Computer Science, Electrical Engineering, Computer Engineering, etc.) or equivalent experience.
4+ years of experience in parallel computing architectures, interconnect fabrics and deep learning applications.
Background in GPU or Deep Learning ASIC architecture evaluation for training and/or inference.
Strong programming skills in Python and C++.
Preferred
Solid fundamental knowledge in computer architecture and interconnect fabrics.
Understanding of modern transformer-based model architectures.
Ability to simplify and communicate rich technical concepts to a non-technical audience.
Curious mindset and excellent problem-solving skills.