About the job
As a Machine Learning Engineer, you will design and build cutting-edge AI/ML systems that drive meaningful business outcomes at scale. You will work cross-functionally to bring innovative machine learning solutions from research and experimentation through to robust, production-grade deployment.
Responsibilities
Deploy, monitor, and support AI tools in production environments, ensuring reliability and performance.
Contribute to the ongoing improvement of ML infrastructure, tooling, and best practices.
Partner with data scientists and engineers to translate business requirements into technical ML solutions.
Conduct rigorous model evaluation, testing, and iteration to continuously improve model quality and efficiency.
Design and integrate LLM-powered features and AI agent workflows into production systems, ensuring reliability, scalability, and performance.
Build and maintain agentic pipelines that leverage tool use, memory, and multi-step reasoning to automate complex business processes.
Evaluate and benchmark LLM outputs as part of the model evaluation lifecycle, assessing quality, latency, and safety in production contexts.
Qualifications
Minimum
8 years of related experience building high-throughput, scalable applications or machine learning models in a production environment.
Bachelor's degree in Computer Science, Statistics, Data Mining, Machine Learning, Operations Research, or a related field.
Proficiency in one or more object-oriented programming languages such as Python, Java, or C++, with hands-on experience building distributed systems.
Experience building large-scale machine learning systems using big data technologies such as Spark, SQL, or Snowflake.
Experience with ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
Familiarity with MLOps practices, including model versioning, CI/CD pipelines, and experiment tracking with tools such as MLflow.
Experience building and deploying applications using large language models (e.g., GPT-4, Claude, Gemini, or open-source alternatives) via APIs or self-hosted inference.
Hands-on experience with agentic frameworks such as LangChain, LlamaIndex, or AutoGen to build multi-step, tool-augmented AI workflows.
Preferred
10 years of related experience building high-throughput, scalable applications or machine learning models in a production environment.
Solid understanding of ML fundamentals including supervised/unsupervised learning, model evaluation, and feature engineering.
Strong problem-solving skills with the ability to translate ambiguous business problems into well-defined ML solutions.
Excellent cross-functional communication skills with the ability to collaborate effectively across engineering and data science teams.
Familiarity with LLM evaluation practices including output quality assessment, hallucination detection, and latency benchmarking in production environments.