About the job
NVIDIA is searching for an intern who can develop tools for, maintain, and operate high-performance infrastructure systems that run AI/machine learning video analytics workloads on NVIDIA data center GPUs for our Metropolis platform. This exciting role requires someone who can build tools to automate software deployment and delivery, monitor systems, and analyze logs for the Metropolis Lab.
Responsibilities
Help automate parts of AI model production and software delivery processes.
Monitor, collect, and analyze system and application logs.
Use Prometheus to monitor system metrics.
Help set up and run Kubernetes systems.
Maintain and configure on-prem test systems with automation.
Create and set up CI/CD automation via GitLab runners and/or Jenkins pipeline scripting.
Develop and maintain DevOps MCP servers to automate development processes.
Qualifications
Minimum
Pursuing a BS or MS in Computer Science or Engineering.
Basic Linux system administration skills.
Ability to code with Python or similar languages.
Understanding of operating systems (Linux) and computer architecture.
Knowledge of building web applications and databases.
Knowledge of Docker and Kubernetes environments.
Preferred
Understanding of building and working with AI models.
System administration experience in bare-metal and/or cloud environments.
Good programming skills with Python, React, and Node.js.
Good understanding of Docker, Kubernetes, and Helm.