About the job
As part of our team, you will help accelerate and optimize our progress in developing multimodal generative foundation models for multiscale biology. We are open to hiring either a Machine Learning Scientist or a Machine Learning Engineer. In either role, you will be an integral part of our multidisciplinary teams building the computational platforms that will enable Altos to achieve its mission. You will partner and collaborate with other Machine Learning Scientists and Engineers across the Institute of Computation to contribute to the Altos research and translation ecosystem, focusing on designing and building state-of-the-art multimodal foundation models that tackle biological questions and aid in the discovery of novel interventions for aging and disease.
Responsibilities
Machine Learning Engineer
- As a Machine Learning Engineer, you will focus on building, deploying, and optimizing machine learning models at scale.
- Pre-train and fine-tune large-scale machine learning systems using multimodal biological data and natural language inputs.
- Develop efficient data-loading strategies and performance tracking to train large models with distributed training across multiple nodes.
- Apply software engineering skills to develop reliable, scalable, performant distributed systems in a cloud environment.
Machine Learning Scientist
- As a Machine Learning Scientist, you will draw on your experience to design, develop, and evaluate state-of-the-art foundation models at scale to benefit the research.
- Pre-train and fine-tune large-scale machine learning systems using multimodal biological data and natural language inputs.
- Gain insights grounded in theory, deep research, and the mathematical underpinnings of your work.
- Use AI as a tool to accelerate your research.
- Apply strong coding experience to model development using existing language(s) and framework(s).
Qualifications
Minimum
Machine Learning Engineer
- MS in Computer Science, Statistics, Machine Learning, Artificial Intelligence, or a related discipline
- 0-5 years of relevant work experience in either an academic or industry setting
- Very strong programming skills, including experience with Python and deep learning libraries (PyTorch, Hugging Face Transformers, Hugging Face Datasets, Hugging Face Accelerate)
- Ideally, experience with a distributed training framework such as DDP, FSDP, DeepSpeed, Megatron, Hugging Face Accelerate, or Ray
- Expertise in a subset of the following: transformers, natural language processing, multi-modality in language and/or in biology, diffusion models.
Machine Learning Scientist
- PhD in Computer Science / Machine Learning or similar fields
- 0-5 years of relevant work experience in either an academic or industry setting.
- Prior experience in developing and implementing novel generative AI models in a subset of the following: transformers, multi-modality, diffusion models.
- Demonstrated deep understanding of, and expertise in, machine learning principles and how they apply to different models
- Very strong programming skills, including experience with Python and deep learning libraries (PyTorch, Hugging Face Transformers, Hugging Face Datasets, Hugging Face Accelerate)
- Strong track record of published, peer-reviewed, innovative AI/ML research
- Experience writing production-quality code with modern machine learning frameworks such as PyTorch, TensorFlow, JAX, or similar
- Experience with multi-GPU and distributed training at scale
Preferred
- Familiarity with multimodal data integration, including early and/or late fusion strategies.
- Track record of ML applied to NGS data (e.g. RNA-seq, ATAC-seq, ChIP-seq, DNA methylation), biological imaging modalities (e.g. microscopy, H&E, IF), and/or spatial transcriptomics.