Senior Manager, AI Inference

About the job

At Red Hat, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat Inference engineering team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading maintainers of the vLLM and llm-d projects, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for enterprises to build, optimize, and scale LLM deployments. As a Senior Engineering Manager of the Machine Learning Engineering team focused on vLLM inference, you will be at the forefront of innovation, collaborating with our team to tackle the most pressing challenges in model performance and efficiency. Your technical and people leadership in machine learning and high-performance computing will directly shape the development of our software platform and the future of AI deployment and utilization. You would be joining the core team behind 2025's most popular open source project on GitHub. If you want to help solve challenging technical problems at the forefront of deep learning in the open source way, this is the role for you.

Responsibilities

Set the overarching vision, objectives, and strategies for a group of engineering teams and engineering managers.

Lead and inspire a distributed team of individual contributors and managers, fostering a collaborative and innovative work environment.

Engage with AI and machine learning open source communities such as vLLM and llm-d.

Work with product management and engineering teams to develop technology roadmaps and schedules, and communicate these schedules externally.

Work with cross-functional engineering managers and teams in documentation, product management, and quality assurance to coordinate the tasks necessary for releasing enterprise-quality MLOps software.

Work closely with the technical leads and scrum leads to direct the team in agile development.

Meet with customers and partners and deliver technical presentations as necessary.

Mentor and nurture team members in their career development and professional growth.

Recruit and build a world-class engineering team.

Qualifications

Minimum

10+ years of significant hands-on software development and system design experience

5+ years of experience in managing software engineering teams

Proven experience in leading machine learning engineering teams, with a track record of successful project delivery and development of software engineers and engineering managers

Preferred

Experience with machine learning frameworks and tools, such as PyTorch and Hugging Face.

Extremely proficient in running and analyzing LLM performance and accuracy benchmarks across a variety of accelerators.

Excellent programming skills in languages such as Python and C++/CUDA.

Experience with developing and scaling applications with Kubernetes.

Experience using AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, and using multi-agent orchestration and automation to accelerate development cycles and enhance code quality.

Excellent written and verbal communication skills.

Ability to lead and work with diverse and distributed teams from multiple countries and cultures.