Senior AI-Native Systems Software Engineer, TensorRT

About the job

Are you passionate about redefining how software is built in the age of Generative AI? Join NVIDIA’s TensorRT team to help lead a first-of-its-kind, AI-native initiative designed to make TensorRT the default entry point for out-of-framework inference globally. We are moving beyond traditional development cycles with a new framework built from the ground up to leverage swarms of AI agents to produce high-performance, high-quality, modern C++ software at an unprecedented scale.

Responsibilities

Architecting an AI-native framework: Help design and build a codebase and architecture that scales beyond human capacity, supporting large numbers of AI agents working in parallel to generate, test, and validate production-grade software.

Scaling through agentic workflows: Improve the ratio of compute-to-software output by adopting and building AI-native tools, multi-agent orchestrators, and codebase harnesses that keep humans focused on the highest-value work.

Rapid prototyping with SOTA models: Act as a technical scout, identifying industry and academic breakthroughs (e.g., new attention mechanisms, KV cache strategies) and dispatching AI agent swarms to prototype and integrate these capabilities into our framework.

Delivering a great user experience: Ensure a seamless, high-performance path to production for the latest model families (LLMs, Diffusion, Audio, Vision, and multi-modal models).

Extreme performance optimization: Work at the intersection of Python orchestration and C++ engine-level optimizations to achieve major latency and throughput gains for critical customer use cases.

Qualifications

Minimum

BS, MS, or PhD in Computer Science, Computer Engineering, or AI, or equivalent experience.

4+ years of relevant software development experience.

Strong modern C++ skills: Proficiency with C++11/14/17 (or newer) and the STL, with an emphasis on clean, maintainable, performant code.

Deep learning familiarity: Experience with modern inference frameworks and an understanding of the architectural nuances of LLMs, Diffusion, and multi-modal models.

Systems thinking: Interest in how software architecture must evolve to support automated, agent-driven development and indefinitely scaling codebases.

End-to-end product sense: Ability to translate high-level customer needs into concrete technical requirements and user-centric solutions.

Pragmatic execution: Demonstrated ability to go from customer requests to production-quality software on tight timelines.

Collaborative mindset: Excellent communication skills and comfort working across internal organizations and with customers.

Preferred

Agentic framework experience: Hands-on work with AI agent orchestrators or multi-agent coding frameworks, or experience building custom agentic coding harnesses for production software.

CUDA & kernel expertise: Experience with CUDA programming or exposure to kernel generation / autotuning efforts.

High-velocity prototyping: A track record of rapidly turning state-of-the-art papers into working prototypes in days, not weeks.

Performance profiling skills: Expertise in software performance analysis, profiling, and optimization (CPU and/or GPU), including using tooling to drive measurable wins.