Forward Deployed Engineer - ML

Modal
New York City / San Francisco / Stockholm
2026-02-23

About the job

We're looking for Forward Deployed ML Engineers who want to operate at the intersection of deep technical work and direct customer impact. As an ML FDE, you'll partner with leading AI companies and foundation model labs to help them achieve state-of-the-art performance on their most demanding workloads: LLM serving, model training (SFT, RLHF), audio pipelines, scientific computing, and more. You'll help teams reach outcomes that most engineers can't reach on their own.

Responsibilities

- Work hands-on with companies like Suno, Lovable, Cognition, and Meta to architect and optimize production AI workloads on Modal

- Contribute to open-source projects — members of the team are active contributors to SGLang — and publish technical content that demonstrates Modal's capabilities across the AI stack

- Collaborate with Modal's product and sales teams, contributing to the platform as both an engineer and a product stakeholder

- Build trusted relationships with technical leaders (CTOs, VPs of Engineering, ML leads) at companies doing frontier AI work

- Conduct technical demos, experiments, and proof-of-concepts that make Modal's performance advantages tangible

Qualifications

Minimum

- 2+ years of professional ML engineering experience, ideally with hands-on work in inference optimization, model training, GPU programming, or ML infrastructure

- Familiarity with the serving (e.g., vLLM, SGLang) and training (e.g., slime, verl, TRL) toolchains. You don't need all of these, but you should be able to go deep on at least one.

- Strong communicator who can go deep on technical architecture with an engineering team and clearly articulate tradeoffs to technical leadership

- Genuine interest in working directly with customers — you find it energizing to understand someone else's problem and help them solve it

- Willing to work in-person in New York City, San Francisco, or Stockholm

Preferred

- Side projects, open-source contributions, or published work in ML or systems performance that you're proud of