Staff Research Engineer, Model Efficiency

Cohere
Toronto, Montreal, San Francisco, New York, Paris, Seoul, London / Remote-flexible
2025-11-07
Remote

About the job

As a Staff Research Engineer, you will develop, prototype, and deploy techniques that materially improve how fast and efficiently our models run in production.

Responsibilities

- Develop, prototype, and deploy techniques that materially improve how fast and efficiently our models run in production

Qualifications

Minimum

- Have a PhD in Machine Learning or a related field

- Understand LLM architecture and how to optimize LLM inference under resource constraints

- Have significant experience with one or more techniques that enhance model efficiency

- Have strong software engineering skills

- Have publications at top-tier conferences and venues (ICLR, ACL, NeurIPS)

- Have a passion for mentoring others

Preferred

No preferred qualifications listed.