Applied AI Engineer – Agentic Workflows

Cohere
Toronto / New York / San Francisco · 2026-01-06 · Remote

About the job

We’re a fast-growing startup building production-grade AI agents for enterprise customers at scale. We’re looking for Software Engineers with Applied AI experience who can own the design, build, and deployment of agentic workflows powered by Large Language Models (LLMs), taking them from early prototypes to production systems that deliver concrete business value in enterprise workflows.

Responsibilities

Work closely with enterprise customers to translate high-value, ambiguous business problems into well-framed agentic problems with clear success criteria and evaluation methodologies.

Provide technical leadership across the full development and evaluation lifecycle, including post-deployment iteration, for agentic workflows.

Contribute to shared frameworks and patterns that enable consistent delivery across customers.

Lead the design, build, and delivery of LLM-powered agents that reason, plan, and act across tools and data sources with enterprise-grade reliability and performance.

Balance rapid iteration with enterprise requirements, evolving prototypes into stable, reusable solutions.

Define and apply evaluation and quality standards to measure successes, failures, and regressions.

Debug real-world agent behavior and systematically improve prompts, workflows, tools, and guardrails.

Mentor engineers across distributed teams.

Drive clarity in ambiguous situations, build alignment, and raise engineering quality across the organization.

Qualifications

Minimum

Substantial experience building, shipping, and maintaining production-grade software (Python/TypeScript). You understand how to write clean, testable, observable, and scalable code.

Hands-on experience building agents that plan and execute multi-step tasks (ReAct, Plan-and-Execute) and interact with external APIs/tools.

Deep familiarity with Frontier Models (GPT, Claude, Gemini), RAG, vector databases (Pinecone, Weaviate, etc.), and orchestration frameworks (LangGraph, CrewAI, or custom state machines).

Proven ability to move beyond "trial and error" by building robust evaluation frameworks to measure agent accuracy, safety, and latency.

Experience leading technical discussions with enterprise customers to translate ambiguous business needs into concrete technical specs.

Experience mentoring distributed teams and setting the architectural standards for AI/Agentic systems.

Preferred

Strong written and verbal communication skills.

Ability and willingness to travel up to 25% of the time (flexible).