About the job
We are looking for talented individuals to join our Search team and help develop state-of-the-art models for information retrieval. The team works on a range of tasks, including training our embedding and reranker models. You will have the opportunity to revolutionize people's search experience by helping build an intelligent, efficient, and precise search system, with plenty of room to try new things, innovate, and productionize your ideas. Your work will focus on advancing semantic search techniques to improve accuracy and efficiency. This involves working with a wide range of novel technologies and collaborating with other teams to integrate your work into our search infrastructure.
Responsibilities
Design, train and improve upon cutting-edge search models.
Gather high-quality retrieval datasets and optimize data pipelines for model training and evaluation.
Work closely with the model serving team to ensure that inference is fast and stable.
Collaborate with product teams to develop solutions.
Engage in research collaborations with our partner organizations and academic affiliates, and publish your work at top-tier conferences and in journals.
Join us at a pivotal moment, shape what we build, have a strong ownership mindset, and wear multiple hats!
Qualifications
Minimum
Proficiency in Python and related ML frameworks and tooling, such as PyTorch, TensorFlow, TF-Serving, JAX, and XLA/MLIR.
Familiarity with training and using various information retrieval models.
Experience using large language models (LLMs) in training-data or evaluation pipelines.
Strong communication and problem-solving skills.
Preferred
Experience building training and/or evaluation datasets for practical use cases.
Proficiency in other programming languages, such as C++ or Golang.
Experience using large-scale distributed training strategies with GPUs.