Engineering Manager, Inference

Anthropic
San Francisco, CA | New York City, NY | Seattle, WA | Remote-Friendly US (Travel Required)

2025-05-29

About the job

Anthropic’s performance and scaling teams focus on making the most efficient and impactful use of our compute resources, whether for inference or training. As an Engineering Manager on these teams, you will be responsible for ensuring that you and your team identify and remove bottlenecks, build robust and durable solutions, and maximize the efficiency of our systems. You will also help bring clarity, focus, and context to your teams in a fast-paced, dynamic environment.

Responsibilities

Provide front-line leadership of engineering efforts to improve model performance and scale our inference and training systems

Become familiar enough with the team’s technical stack to make targeted contributions as an individual contributor

Manage day-to-day execution of the team's work

Prioritize the team’s work and manage projects in a highly dynamic, fast-paced environment

Coach and support your reports in understanding and pursuing their professional growth

Maintain a deep understanding of the team's technical work and its implications for AI safety

Qualifications

Minimum

Have 1+ years of management experience in a technical environment, particularly in performance or distributed systems

Have a background in machine learning, AI, or a related technical field

Are deeply interested in the potential transformative effects of advanced AI systems and are committed to ensuring their safe development

Excel at building strong relationships with stakeholders at all levels

Are a quick learner, capable of understanding and contributing to discussions on complex technical topics

Have experience managing teams through periods of rapid growth and change

Are a quick study: this team sits at the intersection of a large number of different complex technical systems that you’ll need to understand (at a high level of abstraction) to be effective

Preferred

High performance, large-scale ML systems

GPU/Accelerator programming

ML framework internals

OS internals

Language modeling with transformers