Senior Applied AI Engineer – ML for Systems & Infrastructure

Databricks
Mountain View, CA, USA
2025-02-12

About the job

As a Senior Applied AI Engineer at Databricks, you will apply machine learning, scheduling, and optimization algorithms to improve the efficiency and performance of our engineering systems and infrastructure. From cluster management all the way down to query compilation, our Applied AI team works on some of the hardest, most interesting problems facing the business, making Databricks infrastructure and products as performant and cost-efficient as possible. This is a high-impact problem: our customers look to us to deliver the most optimized workloads.

Responsibilities

Build end-to-end systems from the ground up in a small team of experienced people.

Shape the direction of our applied ML areas of investment by engaging with engineering and product teams across the company.

Drive the development and deployment of state-of-the-art AI models and systems that directly impact the capabilities and performance of Databricks' products, infrastructure and services.

Architect and implement robust, scalable ML infrastructure, including data storage and processing, model training and serving components, and monitoring and reporting systems to support seamless integration of AI/ML models into production environments.

Work on novel modeling techniques in the field of ML for Systems.

Contribute to the broader AI community by publishing research, presenting at conferences, and actively participating in open-source projects, enhancing Databricks' reputation as an industry leader.

Qualifications

Minimum

2-8 years of machine learning engineering experience in high-velocity, high-growth companies.

Strong understanding of both computer systems and statistics.

Broad knowledge of, or interest in, mathematical modeling beyond ML (e.g., operations research, combinatorial optimization).

Experience developing AI/ML systems at scale in production.

Strong track record of ML modeling that goes beyond using standard libraries.

Strong coding and software engineering skills, and familiarity with software engineering principles around testing, code reviews and deployment.

Experience deploying, scaling and monitoring models in production; deep understanding of the unique infrastructure challenges posed by training and serving predictions in Tier 0 environments.

Preferred

No preferred qualifications listed.