Applied Scientist

About the job

At Amazon Selection and Catalog Systems (ASCS), our mission is to power the online buying experience for customers worldwide so they can find, discover, and buy any product they want. We innovate on behalf of our customers to ensure uniqueness and consistency of product identity and to infer relationships between products in the Amazon Catalog, which drives the selection gateway for the search and browse experiences on the website. We're solving a fundamental AI challenge: establishing product identity and relationships at unprecedented scale. Using Generative AI, Visual Language Models (VLMs), and multimodal reasoning, we determine what makes each product unique and how products relate to one another across Amazon's catalog. The scale is staggering: billions of products, petabytes of multimodal data, millions of sellers, dozens of languages, and infinite product diversity, from electronics to groceries to digital content.

Responsibilities

Formulate novel research problems at the intersection of GenAI, multimodal learning, and large-scale information retrieval—translating ambiguous business challenges into tractable scientific frameworks

Design and implement leading models leveraging VLMs, foundation models, and agentic architectures to solve product identity, relationship inference, and catalog understanding at billion-product scale

Pioneer explainable AI methodologies that balance model performance with scalability requirements for production systems impacting millions of daily customer decisions

Own end-to-end ML pipelines from research ideation to production deployment—processing petabytes of multimodal data with rigorous evaluation frameworks

Define research roadmaps aligned with business priorities, balancing foundational research with incremental product improvements

Mentor peer scientists and engineers on advanced ML techniques, experimental design, and scientific rigor—building organizational capability in GenAI and multimodal AI

Represent the team in the broader science community—publishing findings, delivering tech talks, and staying at the forefront of GenAI, VLM, and agentic system research

Qualifications

Minimum

PhD, or Master's degree and 4+ years of experience in CS, CE, ML, or a related field

Experience programming in Java, C++, Python, or a related language

2+ years of experience building machine learning models or developing algorithms for business applications

Preferred

Prior experience in the domains of LLMs, foundation models, or large-scale deep learning systems

Publications in top-tier venues such as NeurIPS, ICML, ICLR, CVPR, ICCV, EMNLP, ACL, NAACL, COLING, KDD, SIGMOD, WWW, AAAI, or similar

Experience with Visual Language Models (VLMs), multimodal transformers, or vision-language pretraining

Experience with explainable AI, model interpretability, or uncertainty quantification

Strong experimental design skills and statistical analysis expertise

Hands-on experience with Generative AI, including prompt engineering, fine-tuning, RLHF, or agentic architectures

Track record of deploying ML models at scale in production environments processing billions of data points

Excellent written and verbal communication and data presentation skills